doi: 10.3765/sp.3.1
Quantifiers in than-clauses∗
Sigrid Beck
University of Tübingen
Abstract
The paper reexamines the interpretations that quantifiers in than-clauses
give rise to. It develops an analysis that combines an interval semantics for
the than-clause with a standard semantics for the comparative operator. In
order to mediate between the two, interpretive mechanisms like maximality
and maximal informativity determine selection of a point from an interval.
The interval semantics allows local interpretation of the quantifier. Selection
predicts which interpretation this leads to. Cases in which the prediction
appears not to be met are explained via recourse to independently attested
external factors (e.g. the interpretive possibilities of indefinites). The goal
of the paper is to achieve coverage of the relevant data while maintaining a
simple semantics for the comparative. A secondary objective is to reexamine,
restructure and extend the set of data considered in connection with the
problem of quantifiers in than-clauses.
∗ Versions of this paper were presented at the workshop on covert variables in Tübingen 2006,
at two Semantic Network meetings (in Barcelona 2006 and Oslo 2007), at the 2009 Topics
in Semantics seminar at MIT, and at the Universität Frankfurt 2009. I would like to thank
the organizers Frank Richter and Uli Sauerland and the audiences at these presentations for
important feedback. Robert van Rooij and Jon Gajewski have exchanged ideas with me. The
B17 project of the SFB 441 has accompanied the work presented here — Remus Gergel, Stefan
Hofstetter, Sveta Krasikova, John Vanderelst — as have Arnim von Stechow and Irene Heim.
Several anonymous reviewers and Danny Fox have given feedback on earlier versions, and
David Beaver and Kai von Fintel have commented on the prefinal version. I am very grateful
to them all.
1 Introduction
Example (1) intuitively only has a reading that appears to give the universal
NP scope over the comparison, namely (1′a): all the girls were slower than
John. The reading in which the universal NP takes narrow scope relative to
the comparison is paraphrased in (1′b). Here we must look at degrees of
speed reached by all girls; depending on the precise semantics of the than-
clause (see below), this could mean the maximal speed that they all reached,
i.e. the speed of the slowest girl. Example (1) has no reading that compares
John’s speed to the speed of the slowest girl. Sentence (2), on the other hand,
only has a reading that gives the modal universal quantifier narrow scope
relative to the comparison, (2′b). That is, we consider the degrees of speed
that John reaches in all worlds compatible with the rules imposed by the
modal base of have to. This will yield the slowest permissible speed, and (2)
intuitively says that John’s actual speed exceeded this minimum requirement.
The sentence is not¹ understood to mean that John did something that was
against the rules; that is, reading (2′a), in which the modal takes scope over
the comparison, is not available.
1 Heim (2006b) and Krasikova (2008) include a discussion of when readings like (2′a) are
available. The reading can be made more plausible with a suitable context, depending on the
modal chosen. For the moment I will stick to the simpler picture presented in the text. See
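The two scope paraphrases can be made concrete with a small sketch. The numbers, the finite scale and all function names below are invented for illustration; the code also assumes a monotone reading of degree predicates (whoever reaches a speed reaches every lower degree as well), which the text implies but does not spell out here.

```python
# Hypothetical finite scale of speeds (arbitrary units); not from the paper.
SCALE = range(0, 31)

def reaches(speed, d):
    """Monotone degree predicate: reaching a speed means reaching every lower degree."""
    return speed >= d

def wide_scope(john, others):
    """Universal scopes over the comparison: John exceeds each value separately."""
    return all(john > s for s in others)

def narrow_scope(john, others):
    """Comparison scopes over the universal: John exceeds the maximal shared degree."""
    shared = [d for d in SCALE if all(reaches(s, d) for s in others)]
    return john > max(shared)

girls = [10, 14, 18]            # invented speeds of three girls
print(wide_scope(12, girls))    # False: John is not faster than every girl
print(narrow_scope(12, girls))  # True: John is faster than the slowest girl
```

If `others` is instead read as John's speeds in the worlds compatible with the rules, `narrow_scope` corresponds to exceeding the minimum requirement, the reading reported for (2), while `wide_scope` corresponds to the unavailable reading on which John exceeds every permitted speed.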
We must ask ourselves how a quantifier contained in the than-clause can
have wide scope at all, why it cannot get narrow scope in (1), and why (2) is the
opposite. Since — as we will see in more detail below — these questions look
unanswerable under the standard analysis of comparatives, the researchers
cited above have been led to a revision of the semantic analysis of comparison.
Schwarzschild & Wilkinson (2002) employ an interval semantics for the than-
clause and give the comparative itself an interval semantics. Heim (2006b)
adopts intervals, but ultimately reduces the semantics of the comparison
back to a degree semantics through semantic reconstruction. This allows
her to retain a simple meaning of the comparative operator. A than-clause
internal operator derives the different readings that quantifiers in than-
clauses give rise to. The line of research in Gajewski 2008, van Rooij 2008
and Schwarzschild 2008 in turn adopts the idea of a than-clause internal
operator but not the intervals.
In this paper, I pursue a strategy that can be seen as an attempt to simplify
Schwarzschild & Wilkinson’s proposal. Like them, I derive a meaning for
the than-clause without a than-clause internal operator, and that meaning is
based on an interval semantics. But I combine this with a standard semantics
of the comparative in the spirit of von Stechow 1984. This means that the
end result of interpreting the than-clause must be a degree. Everything will
hinge on selecting the right degree, so that each of the relevant examples
receives the right interpretation.
In Section 2, I present the current state of our knowledge in this domain.
The analysis of than-clauses is presented in Section 3. Section 4 ends the
paper with a summary and some discussion of consequences of the proposed
analysis.
2 State of affairs
The basis of our present perception of the problem presented by (1) and
(2) is the analysis of the comparative construction, because the data are
understood in terms of whether the quantifier appears to take wide scope
over the comparison according to a classical analysis of the comparative,
or whether it would have to be seen as taking narrow scope relative to the
comparison. My presentation assumes a general theoretical framework like
Heim & Kratzer 1998 and begins with specifically Heim’s (2001) version of
the theory of comparison promoted in von Stechow 1984 (see also Klein 1991
and Beck 2009 for an exposition and Cresswell 1977; Hellan 1981; Hoeksema
1983; Seuren 1978 for theoretical predecessors). This theory is what I will
refer to as a classical analysis of the comparative. For illustration, I discuss
the simple example (3a) below. In (3b) I provide the Logical Form and in (3c)
the truth conditions derived by compositional interpretation of that Logical
Form, plus paraphrase. Interpretation relies on the lexical entries of the
comparative morpheme and gradable adjectives as given in (4).
Universal NPs are a standard example for an apparent wide scope quantifier
(see e.g. Heim 2006b). The sentence in (6) below only permits the reading in
(6′a), not the one in (6′b). This can be seen from the fact that the sentence
would be judged false in the situation depicted below.
[Diagram: a degree scale with four marked points]
The example with the differential in (7) shows the same behaviour (it uses a
version of the comparative that accommodates a difference degree, (7c)).
The problem posed by (5) and (7) is exacerbated in (8), as Schwarzschild &
Wilkinson (2002) observe. We have once more a universal quantifier, but
this time it is one that is taken to be immobile at LF: the intensional verb
predict. Still, the interpretation that is intuitively available looks to be one
in which the universal outscopes the comparison, (8′a). The interpretation
in which comparison takes scope over predict, (8′b), is not possible. This is
problematic because the LF we would expect (8) to have is (10), and (10) is
straightforwardly interpreted to yield (8′b).
This is the interpretive behaviour of many quantified NPs, plural NPs like
the girls, quantificational adverbs, verbs of propositional attitude and some
modals (e.g. should, ought to, might). See Schwarzschild & Wilkinson 2002
and Heim 2006b for a more thorough empirical discussion.
(13) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)
= He was coming through too late.
This is the interpretive behaviour of some modals (e.g. need, have to, be
allowed, be required), some indefinites (especially NPIs) and disjunction (com-
pare once more Heim 2006b). It is also the behaviour of negation and negative
quantifiers, with the added observation that the apparent narrow scope read-
ing is one which often gives rise to undefinedness, hence unacceptability (von
Stechow 1984; Rullmann 1995). (That this is not invariably the case is shown
by (22), illustrating that we are concerned with a constraint on meaning rather
than form.)
Here is how the empirical picture presents itself from the point of view of
a classical analysis of comparatives. It appears that there are two different
scope readings possible for quantifiers embedded inside the than-clause,
wide or narrow scope relative to the comparison. But there is usually no
ambiguity. Each individual quantifier favours at most one reading (negation
frequently permits none). Apparent narrow scope readings are straightfor-
wardly captured by the classical analysis. It is unclear how apparent wide
scope readings are to be derived at all. As Schwarzschild & Wilkinson argue,
they are beyond the reach of an LF analysis. It is also unclear what creates
the pattern in the readings that we have observed.
Before we examine modern approaches to this problem, a final comment
on the data. I have presented them the way they are presented in the literature
on the subject, as if they were all impeccable and their interpretations clear.
But I would like to use this opportunity to point out that I find some of
them fairly difficult and perhaps not even entirely acceptable. This concerns
example (6), for which I would much prefer a version with a definite plural (the
girls instead of every girl). The NP the girls is, if anything, more problematic
under the classical analysis, as Schwarzschild & Wilkinson (2002) point out
(having less of an inclination towards wide scope); but see Section 4 for a
comment on how this issue may be relevant for the analysis developed in
this paper.
Other instances are examples with intensional verbs like predict or expect;
when a genuine range is predicted or expected, intuitions regarding when
sentences with differentials like (8″) would be true vs. false are not very firm.
This seems to me an area in which a proper empirical study might be helpful.
The issue is taken up in Section 3.4.
(8″) a. John is two inches taller than I had predicted (that he would be).
b. John arrived at most 10 minutes later than I had expected.
Since it is very hard to see how the data can be derived under the classical
theory, the two theories summarized below (Schwarzschild & Wilkinson 2002
and Heim 2006b) both change the semantics of the comparative construction
in ways that reanalyse scope. The quantificational element inside the than-
clause can take scope there even under the apparent wide scope reading.
The two theories differ with respect to the semantics they attribute to the
comparison itself. They also differ in their empirical coverage.
[Diagram: a degree scale with the points x1, x2, x3 and C]
To simplify, I will suppose that it is somehow ensured that we pick the right
matrix clause interval (Caroline’s height in (23), Joe’s height in the example
below).
2 I present the discussion here in terms of the classical theory’s ontology, where degrees (type
d, elements of D_d) are points on the degree scale and what I call an interval is a set of points,
type ⟨d, t⟩.
Note that the quantifier is not given wide scope over the comparison at all
under this analysis. The interval idea allows us to interpret it within the
than-clause. While solving the puzzle of apparent wide scope operators, the
analysis makes wrong predictions for apparent narrow scope quantifiers (cf.
example (27)). The available reading cannot be accounted for ((28a) is the
semantics predicted by the classical analysis, corresponding to the intuitively
available reading; (28b) is the semantics that the Schwarzschild & Wilkinson
analysis predicts).
The breakthrough achieved by this analysis is that we can assign to the than-
clause a useful semantics while interpreting the quantifier inside that clause.
For this reason, the interval idea is to my mind a very important innovation.
The analysis still has a crucial problem in that it does not extend to the
apparent narrow scope quantifiers. That is, it fails in precisely those cases
that were unproblematic for the classical analysis. I will also mention that
the semantics of comparison becomes rather complex under this analysis,
since the comparative itself compares intervals. This is not in line with the
plot I outlined above of maintaining as the semantics of the comparative
operator the plain ‘larger than’-relation.
Heim (2006b) adopts the interval analysis, but combines it with a scope
mechanism that derives ultimately a wide and a narrow scope reading of
a quantifier relative to a comparison. Her analysis extends proposals by
Larson (1988). Larson’s own analysis is only applicable to than-clauses with
an adjective phrase gap denoting a property of individuals — a limitation
remedied by Heim. Let us consider her analysis of apparent wide scope of
quantifier data, like (29), first. Heim’s LF for the sentence is given in (30). She
employs an operator Pi (Point to Interval, credited to Schwarzschild (2004)),
whose semantics is specified in (31). Compositional interpretation (once more
somewhat simplified for the matrix clause, for convenience) is given in (32).
The than-clause provides intervals into which the height of every girl falls.
The whole sentence says that the set of degrees exceeded by John’s height is such
an interval. Semantic reconstruction (i.e. lambda conversion) simplifies the
whole to the claim intuitively made, that every girl is shorter than John. The
analysis assumes that the denotation domain D_d is a set of degree ‘points’,
and that intervals are elements of D_⟨d,t⟩.
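The prose description of the outcome can be sketched directly: the than-clause is modelled as a property of intervals, and the matrix clause supplies the set of degrees that John's height exceeds. The finite centimetre scale, the heights and the function names are assumptions made for illustration; the sketch does not reproduce the Pi operator itself or the LFs in (30) and (32).

```python
# Hypothetical finite scale of heights in cm; not from the paper.
SCALE = range(100, 221)

def than_clause(girl_heights):
    """Property of intervals: holds of D if every girl's height falls in D."""
    return lambda interval: all(h in interval for h in girl_heights)

def degrees_exceeded_by(height):
    """The interval (set of degree points) lying strictly below the given height."""
    return {d for d in SCALE if d < height}

def john_taller_than_every_girl(john, girl_heights):
    """True iff the degrees John exceeds form an interval containing every girl's height."""
    return than_clause(girl_heights)(degrees_exceeded_by(john))

girls = [160, 170, 175]  # invented heights
print(john_taller_than_every_girl(180, girls))  # True
print(john_taller_than_every_girl(172, girls))  # False
```

On this finite model the result coincides with the simplified claim that every girl is shorter than John, i.e. with `all(h < john for h in girls)`.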
The analysis is a way of interpreting the quantifier inside the than-clause,
and deriving the apparent wide scope reading over the comparison through
giving the quantifier scope over the shift from degrees to intervals (the Pi
operator). It is applicable to other kinds of quantificational elements like
intensional verbs in the same way. Our example with predict is analysed
below; the intuitively plausible reading can now be derived straightforwardly
from the LF in (34).
scope over the Pi operator, the resulting meaning of the whole sentence will
be one that lets the quantifier take scope over the comparison, even though
it is interpreted syntactically below the comparative operator and inside the
than-clause.
There is a group of new proposals — Gajewski 2008, van Rooij 2008 and
Schwarzschild 2008 — for how to deal with quantifiers in than-clauses whose
approach seems to be inspired by Heim’s (2006b) analysis. I present below a
simplified version of this family of approaches that is not entirely faithful
to any of them. I call this the NOT-theory. It can be summarized in relation
to the previous subsection as ‘keep the than-clause internal operator, but
not the intervals’. It adopts the idea that there is an operator — like Heim’s
Pi — that can take wide or narrow scope relative to a than-clause quantifier,
dictating what kind of reading the comparative sentence receives. It does not
adopt an interval analysis, and thus the operator is not Pi and the semantics
of the comparative is not the classical one. Instead, the operator is negation
and the proposed semantics is basically Seuren’s (1978).
The authors mentioned above note that this semantics gives us an easy
way to derive the intuitively correct interpretation for apparent wide scope
quantifiers. This is illustrated below for the universal NP. In (46) I show that
the desired meaning is easily described in this analysis and in (47) I provide
the LF for the than-clause that derives it. (48) illustrates that some, another
apparent wide scope quantifier, is equally unproblematic.
[Diagram: a degree scale with the points g1, g2, g3 and g4]
...
(48) a. John is taller than some girl is.
b. ∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
c. there is a girl who is shorter than John.
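The truth conditions in (48b) can be checked mechanically on a finite scale. This is only a sketch: the integer centimetre scale, the heights and the function name are assumptions, and quantifying over a finite grid of degrees stands in for quantification over the full degree scale.

```python
# Hypothetical finite scale of heights in cm; not from the paper.
SCALE = range(100, 221)

def taller_than_some_girl(john, girl_heights):
    """(48b): there is a degree d that John reaches and that some girl fails to reach."""
    return any(
        john >= d and any(h < d for h in girl_heights)
        for d in SCALE
    )

print(taller_than_some_girl(170, [165, 180]))  # True: the 165 cm girl is shorter
print(taller_than_some_girl(170, [170, 180]))  # False: no girl is shorter than John
```

With heights drawn from the scale, the formula comes out true exactly when at least one girl is shorter than John, matching the paraphrase in (48c).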
The NOT-theory needs another important ingredient: Just like the Pi-operator
above, other than-clause internal operators have to take flexible scope relative
to NOT in order to create the different readings we observe. This is illustrated
below with the familiar have to example, and with allowed.
Just like the Pi-theory, then, the NOT-theory is able to generate the range
of readings we observe for operators in than-clauses. It seems somewhat
simpler than the Pi-theory in that it does not take recourse to intervals in
addition to a scopally flexible than-clause internal operator. But as in the case
of the Pi-theory, we must next ask ourselves what prevents the unavailable
readings, e.g. what excludes the LF in (52a).
The NOT-theory would have an empirical advantage over the Pi-theory if con-
straints on scope could be found to deal with the overgeneration problem we
noted above. A first successful application concerns polarity items. Example (53a)
can only have the LF in (54b), not the one in (54a), according to constraints on
the distribution of NPIs. Thus we only derive the appropriate interpretation.
Note though that the Pi-theory has the same success since the scope of Pi
is a downward entailing environment, but the rest of the than-clause isn’t
(compare Heim 2006b). (55) is the mirror image.
While this is helpful with modals, it stops short of explaining the interpreta-
tion associated with intensional full verbs like predict.
Two further possible constraints are discussed. Van Rooij (2008) examines
universal DPs and Gajewski (2008) investigates numeral DPs. Let us consider
both in turn.
Note first that a universal DP is ambiguous relative to clause mate nega-
tion. In particular it allows a reading in which the universal takes narrow
scope relative to negation. Thus there are no inherent scope constraints that
would help us to exclude (63′b) as an LF of (63a). But exclude it we must,
since it gives rise to the unavailable reading (63c).
Van Rooij observes that (63′a) yields stronger truth conditions than (63′b). He
proposes that if no independent constraint excludes one of the LFs, you have
to pick the one that results in the stronger truth conditions. This amounts to
the suggestion that than-clauses fall within the realm of application of the
Strongest Meaning Hypothesis (SMH; Dalrymple, Kanazawa, Kim, Mchombo &
Peters 1998). If they do, the NOT-theory can make the desired predictions
about every DPs (and some other relevant examples). So could the Pi-theory,
though, so this does not distinguish between the two scope based theories of
quantifiers in than-clauses.
While I am sympathetic to the idea of extending application of the SMH, I
see some open questions for doing so in the case of than-clauses. Dalrymple
et al. originally proposed the SMH to deal with the interpretation of recipro-
cals. (64a) receives a stronger interpretation than (64b), for example, because
the predicate to stare at makes it factually impossible for the reading of (64a)
to ever be true. Similarly for (64c) vs. (64a,b). But (64a) only has one
interpretation, the strongest one, and (64b) also cannot have a reading parallel to
(64c). The SMH says, very roughly, that out of the set of theoretically possible
interpretations you choose the strongest one that has a chance of resulting
in a true statement, i.e. that is conceptually possible.³
b. (in an elevator:)
The second button from the bottom is higher than every other
button.
≠ the second button from the bottom is higher than the lowest
other button.
c. Friday is earlier than every other day of the week. ≠ Friday is
earlier than the latest other day of the week.
[Diagram: a degree scale with the points g1 to g5, a degree d, and John’s height H(J)]
value of its argument. The operator’s semantics is given in (67). The truth
conditions derived for the example are the right ones, as shown in (68) ((68)
uses Link’s (1983) operator ∗ for pluralization of the noun).
(69) EXACT [∃ [λd [John is d-tall] [than λd [three_F girls [λx [NOT x is d-tall]]]]
(69′) max(λn. ∃d[Height(J) ≥ d & for n girls x: Height(x) < d])
the largest number n such that John reaches a height that n girls
don’t is 3. = exactly three girls are shorter than John.
(70) EXACT [∃ [λd [John is d-tall] [than λd [NOT three_F girls [λx [x is d-tall]]]]
(70′) max(λn. ∃d[Height(J) ≥ d & NOT for n girls x: Height(x) ≥ d])
the largest number n such that there is a height John reaches and it’s
The reasoning in (71) makes it clear that this reading leads to truth conditions
that do not correspond to an available reading; they would make the sentence
true in the situation depicted, where there are two girls shorter than John.
[Diagram: a degree scale with the points g1 to g5 and John’s height H(J)]
The NOT-theory would have to come up with an explanation for why this
reading is unavailable. I am not aware that there is at present such an
explanation. Note that even if we didn’t have the reservations about the SMH
pointed out above, it would not apply here, as the two interpretations don’t
stand in an entailment relation.
To summarize: just like the Pi-theory, the NOT-theory faces an overgen-
eration problem. Both the Pi-theory and the NOT-theory solve this easily
regarding NPIs. The NOT-theory also has a simple story about modals and
negative quantifiers. It does not have an explanation for intensional full verbs
and numeral DPs, and I argue it does not have a story about universal DPs (or
other prospective applications of the SMH) either. Thus I see some progress
compared to the Pi-theory, but not a complete analysis. A conceptual advan-
tage seems to be the NOT-theory’s simplicity. But we will need to reexamine
that in the next subsection.
[Diagram: a degree scale with a 2″ interval between H(B) and H(J)]
We combine with the differential next, as shown in (76). Then, the degree d is
bound and the usual semantic mechanisms combine this with the rest of the
main clause in (77). This derives (75).
I conclude that while the type of analysis discussed in this section — what
one might call scopal theories of quantifiers in than-clauses — has brought
forth some very interesting ideas, there are also unanswered questions. It
may be worthwhile to pursue a scopeless alternative, which is what I will do
in the next section.
3 Analysis: Selection
I illustrate the idea behind the selection analysis with example (79), which
would not in fact require intervals at all of course. But, suppose that we in
general compositionally derive as the meaning of the than-clause a set of
[Diagram: the set of intervals containing Bill’s height, H(B)]
In the present case, ‘the’ could be an operator selecting the shortest interval
from the set, i.e. Bill’s height, cf. (82). This seems a natural choice, given that
all other intervals contain extraneous material and that the point that really
‘counts’ is just Bill’s height.
Irene Heim and Danny Fox (p.c.) point out to me that the sense in which
choosing the minimal interval is ‘natural’ is informativity. (83) below states
what the maximally informative propositions out of a set of true propositions
(say, a question meaning) are.
{that John drove 50 mph, that John drove 49 mph, that John
drove 48 mph, . . . }
b. How much flour is sufficient?
λw.λp.∃d[p(w) & p = λw′. d-much flour is sufficient in w′]
{that 500 g is sufficient, that 501 g is sufficient, that 502 g is
sufficient, . . . }
The definition can be extended to (intensions of) arbitrary sets in the following
way:
Fox & Hackl (2006) argue that we want to extend the definition from the
question case to others in order to capture the similarity between (84a,b)
above and (87a), (88a). (87a) refers to the maximum speed John reached and
(88a) refers to the minimum amount that suffices, both maximally informative
in the sense of (85). The instance in (86) extends the analogy from (84a,b)
and (87a), (88a) to (87b), (88b).
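The maximum-speed and minimum-amount cases can be illustrated with a small model of maximal informativity. The formulation used here is an assumption, since the definitions in (83) and (85) are not reproduced in this passage: a proposition counts as maximally informative if it is true in the actual world and entails every true proposition in the set. Propositions are modelled as sets of worlds, and each world is identified with a single number (John's top speed, or the least sufficient amount of flour); these modelling choices and all names are hypothetical.

```python
def m_inf(actual_world, propositions):
    """Labels of the true propositions that entail every other true proposition.

    `propositions` maps a label to a frozenset of worlds; entailment is subsethood.
    """
    true_props = [p for p in propositions.values() if actual_world in p]
    return {
        label
        for label, p in propositions.items()
        if actual_world in p and all(p <= q for q in true_props)
    }

# "John drove d mph", read as "at least d": true in worlds whose top speed is >= d.
speed_worlds = range(0, 101)
drove = {d: frozenset(w for w in speed_worlds if w >= d) for d in range(40, 61)}
print(m_inf(50, drove))          # {50}: the maximum speed reached

# "d g of flour is sufficient": true in worlds whose least sufficient amount is <= d.
flour_worlds = range(400, 601)
sufficient = {d: frozenset(w for w in flour_worlds if w <= d) for d in range(490, 511)}
print(m_inf(500, sufficient))    # {500}: the minimum amount that suffices
```

The same function returns a maximum in one case and a minimum in the other because the direction of entailment between the propositions differs, not because of any difference in the selection mechanism.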
Hence, ‘the’ in (79″) is m_inf, which yields a singleton, combined with taking
from a set its only member (here represented with max). We can understand
these operators as semantic ‘glue’ (a term introduced by Partee 1984, see
also von Stechow 1995): operations that have to enter into composition, in
addition to what the syntax strictly speaking provides, in order to make
the sentence parts combinable. Their presence is required by the need for
interpretability.
Let’s return to the now familiar example (89). We take the than-clause to have
the denotation in (89′).
[Diagram: a degree scale with the points x1, x2, x3 and J; the marked interval is the one into which the height of every girl falls]
(91) and (92) below provide the relevant definitions. We extend the notion of
the ordering relation underlying our degree scale from degrees to intervals,
(91). We can then define the maximal element of a set of intervals, and finally
the end point of an interval, (92).
What I call selection yields the maximum relative to the ordering relation
linguistically given — ‘larger than’ on the size scale in the case of taller. This
follows from more general interpretive mechanisms suggested independently
(compare Jacobson 1995; Fox & Hackl 2006). Application of these mecha-
nisms is required by the need for the than-clause to serve as input to the
comparative operator.
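The selection steps described above can be sketched for the universal case under several simplifying assumptions: a finite integer scale, intervals encoded as closed `(lo, hi)` pairs, informativity modelled as interval containment, and the end point taken to be the upper bound. The definitions in (91) and (92) are not reproduced in this passage, so the helper names and their exact formulation are hypothetical stand-ins rather than the paper's own definitions.

```python
from itertools import combinations_with_replacement

# Hypothetical finite scale of heights in cm; not from the paper.
SCALE = range(150, 191)
ALL_INTERVALS = list(combinations_with_replacement(SCALE, 2))  # (lo, hi) with lo <= hi

def than_clause(heights):
    """The set of intervals into which every given height falls."""
    return [(lo, hi) for lo, hi in ALL_INTERVALS if all(lo <= h <= hi for h in heights)]

def contains(outer, inner):
    """True iff `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def most_informative(interval_set):
    """Intervals contained in every member of the set (here: the smallest one)."""
    return [i for i in interval_set if all(contains(j, i) for j in interval_set)]

def end_point(interval):
    """Upper end of the interval, i.e. its maximum on the 'larger than' ordering."""
    return interval[1]

def taller_than_every(john, heights):
    """Compare John's height with the degree selected from the than-clause."""
    (selected,) = most_informative(than_clause(heights))
    return john > end_point(selected)

girls = [160, 165, 172]  # invented heights
print(most_informative(than_clause(girls)))  # [(160, 172)]
print(taller_than_every(175, girls))         # True
print(taller_than_every(170, girls))         # False
```

The selected degree is the height of the tallest girl, so the comparison yields the apparent wide scope reading without the quantifier leaving the than-clause.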
We can apply the same strategy to narrow scope existentials. This is illus-
trated with (94) below. In contrast to Heim’s analysis and like Schwarzschild
& Wilkinson’s, I assume that the than-clause denotes the set of intervals in
(94′) (once more via the shifted lexical entry for the adjective, (80)). Importantly,
remember that I assume that the shift to intervals must take place
locally, i.e. within the adjective phrase. I do not assume a genuine mobile op-
erator Pi like Heim (2006b) does (whose LF for (94a) would give Pi wide scope
relative to anyone). We dispense with the interpretations for than-clauses
that were attributed to wide scope of the Pi operator.
(95) [Diagram: a degree scale with the points x1, x2, x3 and M]
The selection strategy predicts the right truth conditions for these ‘apparent
narrow scope’ and ‘apparent wide scope’ quantifier data without changing
scope. This allows us to predict ungrammaticality of negation straightfor-
wardly, as illustrated below.
3.1.3 Negation
(96′) will not yield a well-defined meaning for the comparative. Just as in
the original analysis of these data, the than-clause will not provide us with a
maximum, since there is no largest interval containing no girl’s height. Max>
is undefined; hence negation in the than-clause leads to undefinedness of the
comparative as a whole. Since there is no other option, we no longer face
the problem of ruling out the apparent wide scope reading of the negative
quantifier.
The simple data discussed in this subsection highlight the potential attrac-
tion of the selection analysis. We keep a simple semantics for the comparative
This subsection concerns universal quantifiers that do not behave like every
girl, predict and other apparent wide scope universals. Remember from Sec-
tion 2 that modals like have to appear to favour a narrow scope interpretation
rather than the apparent wide scope interpretation described and derived
above for other universals.
(97) Mary wants to play basketball. The school rules require all players to
be at least 1.70 m.
(97′) a. Mary is taller than she has to be.
b. Mary’s actual height exceeds the degree of tallness which she has
in all worlds compatible with the school rules;
i.e. Mary’s actual height exceeds the required minimum, 1.70 m.
There are two analyses, as far as I am aware, that propose to reduce the
variation in the interpretation of than-clauses with universal modals between
maximum and minimum interpretation to independent factors, such that
the readings collapse into one. Meier (2002) proposes that the ordering
source that modal semantics uses is responsible for a contextually guided
determination of the interpretation, explaining away apparent maxima and
minima both. Krasikova (2008) examines the problem of have to–type modals
in comparatives in particular and employs covert exhaustification to explain
away apparent ‘more than minimum’ interpretations. While both approaches
solve the problem at hand equally well for my purposes, I describe below
Krasikova’s suggestions because they seem to me to offer more promise for
identifying which modal operators give rise to which reading(s).
Krasikova (2008) points out that whether we get a ‘more than minimum’
reading like the one illustrated above for this type of modal or a ‘more than
maximum’ reading parallel to the reading illustrated for predict depends on
the context an individual example is put into. Remember example (99) from
above, which shows that have to–type modals may also give rise to a ‘more
than maximum’ reading, the reading we expect under the present analysis.⁵
Thus what distinguishes have to–type modals from others is the availability
of an apparent narrow scope reading (a ‘more than minimum’ reading under
the present perspective).
(99) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)
Krasikova further observes that the universal modals that can give rise to
the ‘more than minimum’/apparent narrow scope reading are just the ones
that occur in sufficiency modal constructions (SMC). An example of an SMC
is given below (von Fintel & Iatridou 2005).
(100) You only have to go to the North End (to get good cheese).
5 It is not at present clear to me under what circumstances a have to–type modal seems
to permit a more-than-maximum interpretation. Relevant factors may be the choice of a
negative polar adjective and a subjunctive-like interpretation (Danny Fox and Irene Heim,
p.c.). Personally, I find this interpretation very hard to get.
The combination of only and a modal in the SMC considers alternatives to the
proposition that is the complement of have to, and ranks those alternatives
on a scale. Plausible alternatives for our example and their ranking are given
in (101). They provide the domain of quantification, C in (102); (102a) sketches
a structure for the example, (102b) a meaning for ‘only have to’ and (102c)
the outcome, which corresponds to the desired truth conditions (100′). Note
that the SMC reading is one that identifies the point on a scale that is the
minimum sufficiency point, as illustrated in (103).
(101) a. that you go to the nearest supermarket, that you go to the North
End, that you go to New York, that you go to Italy
b. SUPER < NE < NY < Italy (where ‘<’ means: is easier than)
(102) a. [[only have to]C,< [you go to the North End]]
b. [only have to]C,< (p)(w) = 1 iff
∀q[q ∈ g(C) & ¬(q < p) → ¬have to(q)(w)]
c. For all q such that q is in g(C) and ¬(q < NE) : ¬have to(q)
(103) [Scale diagram: the point NE separates the region labelled ‘necessary’ from the region labelled ‘not necessary’]
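The meaning in (102b) can be evaluated directly over the alternatives in (101). The sketch below takes the ranking from (101b) and implements the condition as stated; the sets of required propositions passed to the function are purely hypothetical stipulations about what the modal base demands, chosen only to exercise the definition.

```python
# Alternatives from (101), listed from easiest to hardest as in (101b).
ALTERNATIVES = ["SUPER", "NE", "NY", "Italy"]

def easier(q, p):
    """q < p on the scale in (101b): q is easier than p."""
    return ALTERNATIVES.index(q) < ALTERNATIVES.index(p)

def only_have_to(p, required):
    """(102b): no alternative q that is not easier than p is required.

    `required` is a hypothetical set of alternatives that 'have to' holds of.
    """
    return all(q not in required for q in ALTERNATIVES if not easier(q, p))

print(only_have_to("NE", {"SUPER"}))        # True: nothing from NE upwards is required
print(only_have_to("NE", {"SUPER", "NY"}))  # False: going to New York is required
```

As in (102c), the check ranges over NE itself and everything harder, so the sentence comes out true only when the point where necessity stops lies below NE on the scale.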
My sketch leaves unaddressed all the thorny problems of the SMC construc-
tion like the composition of only and have to with the rest of the clause,
and the problem of only’s presupposition; compare in particular von Fintel &
Iatridou 2005 and Krasikova & Zhechev 2006. What is important for present
purposes is Krasikova’s observation that the interpretation that have to–type
modals give rise to in than-clauses can be seen as an SMC interpretation. The
‘more than minimum’ interpretation just like the SMC identifies the point
on a scale that is the minimum sufficiency point. Whatever is a plausible
analysis of the SMC should be extendable to the problem at hand.
(104) _ _ _ _ _ _ _ _ _ _ _ _ • _ _ _ _ _ _ _ _ _ _ _ _ /
      necessary             1.70 m        not necessary
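The meaning sketched in (102b) can be made concrete with a small toy model. This is only an illustration under stated assumptions: the function names `easier` and `only_have_to`, the encoding of the scale as a list ordered from easiest to hardest, and the set `required` (a stand-in for what have to holds of in the world of evaluation) are hypothetical and not part of the analysis itself.

```python
# Toy model of (102b): [only have to]_{C,<}(p)(w) = 1 iff
# every q in g(C) that is not easier than p is not required in w.
# Assumption: alternatives are listed from easiest to hardest, as in (101b).

ALTERNATIVES = ["SUPER", "NE", "NY", "Italy"]  # SUPER < NE < NY < Italy


def easier(q, p, scale=ALTERNATIVES):
    """q < p on the scale: q is strictly easier than p."""
    return scale.index(q) < scale.index(p)


def only_have_to(p, required, scale=ALTERNATIVES):
    """Evaluate (102b) literally.

    `required` is a hypothetical stand-in for have_to(.)(w): the set of
    alternatives that are necessary in the world of evaluation.
    """
    return all(q not in required for q in scale if not easier(q, p, scale))
```

If nothing at least as hard as going to the North End is required, `only_have_to("NE", ...)` comes out true; if going to New York is required, it comes out false.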
Krasikova suggests that have to–type modals can use Fox’s (2007) covert
exhaustivity operator EXH instead of only, whose meanings are basically the
same. This is what happens in our comparatives, and this is responsible for
the ‘more than minimum’ interpretation.6 A structure for the than-clause
of (97′a) is given in (105a). Its interpretation using (102b) is (105b). Suppose
now that the relevant alternatives are the propositions in (106a), which place
Mary’s height in varying intervals. Our context is such that difficulties arise
with respect to reaching a certain height. Being short is not hard, being
tall is difficult. Thus the ordering of the alternatives in (106a) is one that
ranks them according to the height of the interval on the tallness scale into
which Mary’s height falls. The requirement easiest to meet is the minimal
compliance height. Given this, (105b) can be paraphrased as (105c).
Applying maximal informativity as usual yields the meaning below for the
subordinate clause, the minimum ‘point’ as desired. Selection with Max>
is trivial; the resulting meaning is that Mary’s actual height exceeds the
minimum compliance height.
6 As an anonymous reviewer points out, this raises the question of why we cannot have an
overt only in such sentences, cf. the ungrammaticality of (ia). The editors point out that
extraction of the associate of only is not good, cf. (ib). This would have to be different for
EXH than for only in order to answer the reviewer’s question.
(107) m_inf([λD. nothing more difficult is required than for Mary’s height
to fall within D])
= {the minimum compliance height}
= {[1.70–1.70]}
SMC readings of have to–type modals explain the ‘more than minimum’
reading that they can give rise to in comparative than-clauses with the single
assumption that EXH takes the place of only. Internal to the subordinate
clause, exhaustification occurs. Exhaustification of the than-clause reduces
the than-clause interval to a point. The ‘point’ that exhaustification yields is
the minimum compliance height.
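The step in (107) can be sketched in code. The sketch rests on assumptions that are mine rather than the text's: heights are integers in centimetres, the requirement is of the form "Mary's height is at least `min_required`", candidate than-clause intervals are point-like, and the names `nothing_harder_required`, `m_inf_point` and `taller_than_she_has_to_be` are hypothetical.

```python
# Heights are in centimetres to avoid floating-point noise.
# Assumption: the requirement is "Mary's height is at least min_required".

def nothing_harder_required(d, min_required):
    """True iff nothing more difficult than reaching height d is required.

    Reaching a greater height counts as more difficult, so the claim holds
    exactly when d is at least the required minimum.
    """
    return d >= min_required


def m_inf_point(candidates, min_required):
    """Pick the most informative true candidate.

    The claim for d entails the claim for every taller d', so the smallest
    true candidate is the most informative one.
    """
    true_points = [d for d in candidates
                   if nothing_harder_required(d, min_required)]
    return min(true_points) if true_points else None


def taller_than_she_has_to_be(actual, candidates, min_required):
    """'More than minimum' reading: actual height exceeds the selected point."""
    point = m_inf_point(candidates, min_required)
    return point is not None and actual > point
```

With a required minimum of 170 cm, the selected point is 170, matching the singleton in (107); because the result is a single point, the subsequent selection step has nothing left to choose between.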
I follow Krasikova in making the connection between SMC use and ‘more
than minimum’ readings and in her analysis in terms of exhaustification. This
allows me to maintain the selection analysis from the previous subsection.
According to this analysis, have to–type modals don’t require any revision of
the semantics of comparative constructions. We need to take into account
the special semantics of SMC modals instead. Contrary to appearances,
we uniformly select a degree from an interval via Max> ; with have to, we
may apply Max> after exhaustification. This gives rise to a ‘more than
minimum’/apparent narrow scope reading. If exhaustification does not apply,
we get the regular ‘more than maximum’ = apparent wide scope reading (cf.
example (99) above). Modals that do not permit an SMC reading do not permit
a ‘more than minimum’ reading either, because the ‘more than minimum’
reading is an SMC reading. I refer the reader to Krasikova 2008 for further
discussion. Crucially for present purposes the correlation with SMC use
provides an independent criterion for when to expect which reading. The
contrast between the different kinds of universal quantifiers is not analysed
as a scope effect. The analysis argued for here makes the interpretation of
have to–type modals a property of those particular lexical items. They are
the only apparent narrow scope items requiring special attention since in
contrast to the scope analysis’ procedure, apparent narrow scope existentials
have already been taken care of.
This section concerns existential quantifiers that do not behave like NPI any
and other apparent narrow scope existentials. The problem for the selection
The intuitively available interpretation (108′a) looks once more like a straight-
forward wide scope reading of the numeral quantifier. Application of the
selection strategy predicts an interpretation that is unavailable, (108′b), as
illustrated below.
_ _ •_ _ _•_ _ _• _ _ _ _ •_ _ _•_ _ _• _ _ _• _ _ •_ _ _ _ _ _/
c1 c2 c3 c4 c5 c6 c7 c8
Max>
We face the combined challenge of (i) predicting the right interpretation and
(ii) not predicting the non-existing one. I propose to tackle this problem
through a more thorough analysis of numeral NPs. We will first consider
indefinite NPs in the context of than-clauses and then move on to numerals
and example (108).
Furthermore, I will assume that indefinite NPs, e.g. with German ein (‘a’),
are ambiguous between the ‘normal’ interpretation ‘∃x’ (existential quan-
tification over individuals) and the ‘specific’ interpretation ‘∃f ’ (existential
quantification over choice functions). Below I provide a selection analysis
of the two readings of (111) under those assumptions.7 On this analysis,
the apparent narrow scope reading amounts to a ‘∃x’ interpretation and
7 I use the German example because the larger English inventory of indefinites makes it hard
for me to determine which examples are genuinely ambiguous.
the apparent wide scope reading amounts to a ‘∃f ’ interpretation for the
indefinite.
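The contrast between the two interpretations can be illustrated over a finite toy domain. The sketch below is hedged: the function names and the representation of individuals by their heights are hypothetical, and quantification over choice functions is approximated by quantifying over the individuals a choice function could pick.

```python
def narrow_scope_reading(subject_height, heights):
    """'∃x' reading with selection.

    The minimal intervals containing some individual's height are the
    individual heights themselves; Max> then picks the greatest of them.
    """
    return subject_height > max(heights)


def wide_scope_reading(subject_height, heights):
    """'∃f' reading: some choice function picks an individual whose height
    the subject exceeds. Over a finite domain this is a plain `any`."""
    return any(subject_height > h for h in heights)
```

The apparent narrow scope reading is the stronger one: it requires the subject to exceed every individual in the domain, whereas the apparent wide scope reading only requires exceeding one.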
I further assume that the usual factors (in particular, the nature of the
indefinite and what readings the sentence context permits) decide when we
can get which reading(s) of a singular indefinite. I have nothing illuminating
to say about the particulars of this; note, however, that I do assume that
apparent narrow scope readings are possible with indefinites/existentials
other than NPIs. My intuitions regarding German indefinites like jemand
(someone) + anders/sonst (other/else), wh-word + other/else convince me
of this in particular, because these indefinites are not, I believe, plausibly
analysed as polarity items, nor are they plausibly analysed as generic (hence
not existential). Other languages’ inventory of indefinites may make my view
of what the interpretive possibilities of existentials in than-clauses are appear
less obvious. I am grateful in particular to Sveta Krasikova for discussion of
this point.
Also, the data in (117) (in addition to (111) above) provide an indefinite, ein
anderer (‘another’), that is ambiguous. Both (117a) and (117b) were collected
informally from the web. Context makes it clear that (117a) is intended to
mean ‘faster than everyone else’ and (117b) is intended to mean that someone
was slower.
(117) a. Wir denken 7-mal schneller, als ein anderer reden kann.
we think 7 times faster than an other talk can
‘We think seven times faster than anyone else can talk.’
b. Die meisten überholten mich, aber ab und zu war ich auch
the most passed me but now and then was I also
mal schneller als ein anderer.
once faster than an other
‘Most people passed me, but now and then I was faster than
someone.’
The version with the singular indefinite can have an apparent narrow scope or
an apparent wide scope interpretation (with some speaker variation regarding
which interpretation is favoured). It is known that bare plurals prefer narrow
scope interpretations — let’s say this implies that the choice function ‘∃f ’
interpretation is dispreferred. What the oddness of the plural data tells us,
then, is that there is something unexpectedly wrong with the non-specific ‘∃X’
interpretation of the plural indefinite (I write capital ‘X’ to indicate plurality,
in contrast to ‘x’ for singular). Note that the data (118)–(120) improve when
some or several/einige is added to the plural indefinite. They then have an
apparent wide scope or ‘∃f ’ interpretation. The following generalization
emerges:
Why should a plural indefinite sound odd unless it can easily receive a
specific interpretation? The generalization is intuitively unsurprising once
we examine the ‘∃X’ interpretation more closely. Careful consideration as to
what it would mean in the case of (120), provided in (122a), reveals that (given
that there is more than one sister of Greg’s) it would be true iff the sentence
with the singular ‘∃x’ (‘any sister of Greg’s’) would be true. I suggest that
this makes the interpretation (122a) somehow inappropriate for the example.
Perhaps this can be seen as a matter of economy: the plural has no purpose,
(123) is a first shot at what the relevant constraint might effect. The reading
that survives, (122b), is one in which, compared to the corresponding singular
indefinite, the plural serves a purpose.
The following three sets of data replace the proper name in (124) with various
kinds of indefinites in the three constructions. The plain singular indefinite
is fine and picks out the fastest speed in the definite description and the
question as well as in the than-clause — in addition to a possible specific
reading. The bare plurals are somewhat odd, which we can explain if a
constraint like the BUMP above is operative (and the ‘∃f ’ interpretation is
dispreferred). The last set with plural some indefinites are fine and have the
specific reading. Plural indefinites with some are different from bare plurals
in easily allowing an ‘∃f ’ interpretation.
These data share the problem of having to determine unique reference from
a set via maximality/informativity. They motivate the way that the BUMP
is phrased above. Perhaps it is the nature of maximality/informativity as
‘glue’ that makes it sensitive to such a constraint: the step of postulating
such operators is an inference one draws to have things make sense, and
such inferences are subject to ‘making sense’-type of requirements like the
BUMP. But I hasten to add that I am by no means confident that I understand
what is at stake and that more work ought to be done in figuring out what
the BUMP is really about.
I conclude this subsection with a couple of comments on further kinds
of indefinites. The first data point confirms the perspective on the data
developed so far with the German example (128), where the obligatorily
weak lauter (several/many) sounds very strange. Only einige (several) is
acceptable, under an apparent wide scope reading.
(130) a. John solved this problem faster than any girl did.
b. ??John solved this problem faster than any girls did.
c. John solved this problem faster than any of the girls did.
I don’t understand why some people judge (130b) to be fine; I wonder whether
a Free Choice interpretation of any girls is possible for those who accept the
sentence.
A final remark: it is not the case that plural indefinites in than-clauses
are generally bad, not even narrow scope ones. The data in (131) embed the
indefinite beneath another operator, and the BUMP does not apply.
3.3.2 Numerals
Krifka 1999 on the semantics of such NPs). Remember the simple example
(66) and its analysis.
This step does not immediately solve our problem. If we give the than-clause
in (108) the semantics in (132), nothing changes: we still compare with the
tallest of John’s classmates, as long as there are at least five. Notice, however,
that this interpretation is just as strange as the plain plural indefinite ‘∃X’
interpretation above, since the number information serves no real purpose
for the truth conditions.
This reading is thus ruled out by the same constraint BUMP. We should then
alternatively consider a choice function analysis of the indefinite ‘n class-
mates’. I combine this below with the assumption that exactly is evaluated in
the matrix clause. In (134), we derive the desired interpretation.
(135) a. [EXACT [John is taller [than Max> m_inf [(exactly) 5f of his class-
mates are tall]]]]
b. Out of all the alternatives of the form ‘John is taller than n of his
classmates are’, the most informative true one is ‘John is taller
than 5 of his classmates are’.
Thus I suggest that a proper semantic analysis of numeral NPs makes the
facts compatible with a selection solution after all.
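The reasoning behind (135b) can be sketched as follows. The function names `taller_than_n` and `exactly_n` and the numeric toy data are hypothetical; the sketch assumes that the most informative true alternative is the one with the largest numeral, since being taller than n classmates entails being taller than any smaller number of them.

```python
def taller_than_n(john, classmates, n):
    """Choice-function reading of 'John is taller than n of his classmates are':
    some group of n classmates is such that John exceeds its tallest member.
    Over a finite domain this holds iff at least n classmates are shorter."""
    shorter = [h for h in classmates if h < john]
    return n >= 1 and len(shorter) >= n


def exactly_n(john, classmates, n):
    """EXACT: n is the most informative (here: largest) true alternative."""
    true_ns = [k for k in range(1, len(classmates) + 1)
               if taller_than_n(john, classmates, k)]
    return bool(true_ns) and max(true_ns) == n
```

On this sketch, "exactly 5" is true just in case five classmates, and no more, are shorter than John, which is the intuitively available interpretation.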
I will make further use of the semantics developed by Hackl (2001a,b, 2009)
for these NPs, according to which ‘many N’ is an indefinite NP including a
gradable adjective in the positive form, and ‘most N’ is correspondingly a
superlative. This makes feasible analyses that can be paraphrased in the
following way:10
(137′) John is taller than the tallest of the many-membered group of class-
mates of his selected by f (f a choice function).
(138′) John is taller than the tallest of the group selected by f , which
comprises a majority of his classmates (f a choice function).
More detailed analyses are given below ((139) provides the two potential
readings of (137) and (140)–(142) analyse (138)). Besides being able to predict
the existing readings, the BUMP constraint in (123) will rule out the ones that
are intuitively unavailable.
To sum up: this subsection has analysed the available vs. unavailable readings
of indefinite NPs in than-clauses using a choice function mechanism plus a
constraint on unmotivated pluralization. The formulation of the BUMP in
(123) is offered as a first version of the constraint we need; what we want to
derive is that it is strange to say ‘John is taller than exactly three girls are’ if
we meant, and might as well have said ‘John is taller than any girl is’. Since
this seems eminently reasonable, I am hopeful that a good way of stating
the relevant constraint exists. Given this, the present section has extended
the selection analysis to apparent wide scope indefinite NPs of various kinds
(including numerals, many and most), using a pseudoscope mechanism
argued for extensively for indefinites independently of comparatives. The
comparative semantics itself remains simple.
The final kind of data that does not immediately fall out from the selection
analysis is represented by example (143) below: a than-clause containing a
universal quantifier in combination with a differential.
(144) [[than every girl is tall] [5 [John is exactly 2″ taller t5]]]
(144′) [than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]] =
       λD′. ∀x[girl(x) → Height(x) ∈ D′]
       intervals into which the height of every girl falls
(145) (144) = [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is exactly 2″ taller than d)
      = for every girl x: John is exactly 2″ taller than x
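The difference between the outcome in (145) and a selection-based outcome can be made explicit over toy data. The sketch is hedged: heights are integers in inches, the function names are hypothetical, and the most informative interval satisfying (144′) is taken to be the smallest one that contains every height.

```python
def covers_every_height(interval, heights):
    """(144'): the interval D' contains the height of every individual."""
    lo, hi = interval
    return all(lo <= h <= hi for h in heights)


def minimal_interval(heights):
    """Most informative interval satisfying (144'): the smallest one."""
    return (min(heights), max(heights))


def scope_reading(john, heights, diff):
    """(145): for every girl x, John is exactly `diff` taller than x."""
    return all(john - h == diff for h in heights)


def selection_reading(john, heights, diff):
    """Selection: John is exactly `diff` taller than the Max> of the interval."""
    _, top = minimal_interval(heights)
    return john - top == diff
```

The reading in (145) can only be true if all the girls have the same height, whereas the selection reading compares John with the tallest girl and is compatible with a spread.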
To sum up the picture so far, differentials with exactly and at most, and
perhaps simple differentials, seem to be problematic for the selection analysis
as opposed to the scope analysis.
However, there is more to say about this issue empirically and theoreti-
cally. Beginning with the theoretical side, note that the interpretation of the
matrix clause in (144) was simplified in terms of not giving the differential
quantifier exactly 2″ independent scope.11 Data like (150) show that such
expressions do take scope, however:
Hence, in addition to (a more elaborate version of) (144) above, the LF and
interpretation in (152) become possible. For the Pi theory, this leads to
availability of the analysis in (153).
(152) [[exactly 2″] [4 [[than every girl is tall] [5 [John is t4 taller t5]]]]]
(144′) [than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]] =
       λD′. ∀x[girl(x) → Height(x) ∈ D′]
       intervals into which the height of every girl falls
(153) (152) = [exactly 2″](λd′. [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is d′ taller than d))
      = [exactly 2″](λd′. for every girl x: John is d′ taller than x)
      = max(λd′. for every girl x: John is d′ taller than x) = 2″
      ‘The largest amount that John is taller than every girl is 2″.’
11 Thanks to Danny Fox for drawing my attention to this point.
Note that this LF no longer predicts all the girls to have the same height. It
says that John is exactly 2″ taller than the tallest girl, just like the selection
analysis. It is thus not clear that the predictions of the scope analysis are
really different from, and superior to, the selection analysis.
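The last step of (153) can be checked on toy data. The sketch assumes, as the use of max suggests, a monotone reading on which ‘John is d′ taller than x’ holds whenever John exceeds x by at least d′; the function names and the finite list of candidate differentials are hypothetical.

```python
def at_least_taller(john, x, d):
    """Assumed monotone reading of 'John is d taller than x'."""
    return john - x >= d


def max_differential(john, heights, candidates):
    """(153): the largest d' such that John is d' taller than every girl."""
    true_ds = [d for d in candidates
               if all(at_least_taller(john, h, d) for h in heights)]
    return max(true_ds) if true_ds else None
```

Under that assumption the maximal differential equals the distance between John's height and the height of the tallest girl, which is why this LF and the selection analysis converge.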
Next, let’s take a closer look at the data. Above, we identified as a problem
that EQ is not predicted, where EQ is the assumption that all individuals
universally quantified over have the same height (or whatever the gradable
predicate measures). However, the data are quite difficult. While I agree with the
perception in the literature that in (143a) the EQ is plausible, it is clear that it
does not always arise. Below are some examples where it doesn’t; (154)–(156)
are collected from the internet.12 The reader can convince her/himself that
further relevant data can easily be found. The difficulty in determining the
interpretation of data with nominal universal quantifiers is related to the
point mentioned in Section 2 about differentials and intensional verbs. I
mention in (156′) a suggestive example also collected from the web.
(154) Aden had the camera for $100 less than everyone else in town was
charging.
(155) WOW! Almost 4 seconds faster than everyone else, and a 9 second
gap on Lance.
(156) Jones was almost an inch taller than the both of them. (the both
of them = John Lennon and Paul McCartney, Jones = Tom Jones.
The author thinks that Jones was 5′11″ and that Paul McCartney was
about 5′10″. John Lennon is reported to be shorter than McCartney
by about an inch.)
(156′) I finished 30 seconds faster than I expected. [. . . ] I know my 300
yard time more accurately now.
(the continuation suggests that the speaker’s expectation was a
range rather than a precise point in time.)
distance between that and the main clause degree. This is demonstrated for
(155) below.
We face the task of figuring out what distinguishes (143) from (154)–(156),
i.e. why EQ arises in some data but not all. I would like to ask this question
in terms of how the selection analysis might predict not only (154)–(156),
but also (143). To this effect, let’s take a closer look at the combination of a
differential with a comparative.
Note that we understand a claim like (157a) relative to a plausible level of
granularity. For us to judge (157a) to be true, it is in most contexts sufficient
to be precise up to the level of a few millimeters. Suppose on the other hand
that (157b) is about a sensitive piece of machinery. A one millimeter margin
could very well not be acceptable. This means that what we call John’s height,
or that rod’s length, is actually somewhat fuzzy: it is a ‘blob’ or an interval
on the relevant scale whose size depends on context. The sensitivity to a
level of precision is not represented in the standard truth conditions of the
two examples given in (157′).
To capture this, I follow Krifka (2007) in assuming that a scale can be divided
into different units. A unit on the scale then has to be identified that can
count as a ‘point’ at the contextually relevant level of granularity. Which
(i) a. Ben was almost a year older than everyone else in his class (because he had just
missed the deadline for the previous school year).
b. #For all x ≠ Ben: Ben was almost a year older than x.
c. #Ben was almost a year older than the next oldest in his class.
d. ?The others’ ages center around a point almost a year younger than Ben.
(158) [Diagram: the height scale around 1.80 m, shown divided into units of
5 cm, 2 cm and 1 cm as successively finer levels of granularity.]
(159) Let ⟨S, >⟩ be a scale. Then Cov is a cover of S if Cov is a set of
subsets of S such that each d in S is in some set in Cov, each set in
Cov is contiguous and no two sets in Cov overlap. Assume Cov to
be the set of intervals that are of the contextually relevant size.
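The three conditions in (159) can be checked mechanically for a finite scale. The sketch below is an illustration only: it assumes a discrete scale given as a sorted list of distinct degrees and cells given as Python sets, and the name `is_cover` is hypothetical.

```python
def is_cover(scale, cells):
    """Check the three conditions of (159) for a finite, ordered scale.

    `scale` is a sorted list of distinct degrees; `cells` is a list of sets
    of degrees.
    """
    position = {d: i for i, d in enumerate(scale)}
    # Every cell must be a subset of the scale.
    if any(not cell <= set(scale) for cell in cells):
        return False
    # Each degree of the scale lies in some cell.
    if set().union(*cells) != set(scale):
        return False
    # Each cell is contiguous: its members occupy adjacent positions.
    for cell in cells:
        idx = sorted(position[d] for d in cell)
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    # No two cells overlap: sizes add up to the size of the union.
    total = sum(len(cell) for cell in cells)
    return total == len(set().union(*cells))
```

Choosing larger or smaller cells for the same scale corresponds to a coarser or finer contextually relevant level of granularity.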
I furthermore revise the definition of an end “point” from (160) to (161) ((161b)
is the informal version, (161c) the more precise version employing covers).
Note that the distinction between points and intervals dissolves under this
view because what we usually call a point is an interval on the scale whose
size depends on context.
Supposing that we talk about what we roughly call 1.80 m, the meanings of
our two than-clauses could (depending on context, i.e. the relevant cover)
come out as in (162). It is thus in the nature of scales that they have a
part/whole structure whose units are determined in a context dependent
manner.
granularity relevant for the than-clause has to make sense in relation to the
differential.
The reasoning works out given that the cover, and therefore the unit that
counts as ‘maximal point’, is determined locally, i.e. than-clause internally,
independently of the differential which will then either fit or clash.14
I think that granularity offers an explanation for the interpretive effect I
call EQ. Consider the situation depicted below for (165). If we have no further
information regarding the situation, the girls’ sizes can be far apart. This
would indicate a large interval. The idea is that the semantics of the than-
clause itself indicates possible Covers. There is then a danger that we have a
coarse-grained cover. A reasonable division of x1 –x5 would be
into relatively long units, hence Max> is long. This would be incompatible
with the differential — a granularity clash. That is, a sentence in which the
than-clause indicates a real spread (e.g. because of a universal quantifier)
brings with it the danger of a granularity mismatch with a differential.
x1 x2 x3 x4 x5 J
m_inf((than) every girl is tall) = { x1 –x5 }
14 A similar effect can be observed with Covers in the plural domain in examples like (i) below.
Suppose we are talking about Angelina and Reginald Johnson and Mary and John Smith.
Then the two subjects in (ia) and (ib) refer to the same group, but make different covers
salient (Schwarzschild 1996). By virtue of the cover suggested by the subject, (ia) tends to be
understood as ‘the women love their child and the men love their child’, which is unexpected.
(ib) amounts to ‘the Smiths love their child and the Johnsons love their child’, which is more
expected. The point is that the subject group autonomously makes salient a cover, whether
this leads to a plausible interpretation of the whole or not.
The Cover indicated by the than-clause may agree with the differential
only under an additional assumption of closeness of the individual “points”
covered by the than-clause interval. My suggestion is that if a potential
granularity clash could only be avoided under an additional assumption of
closeness, one tends to assume equality and a default Cover of the than-
clause interval D in terms of the singleton set {D}. This is the EQ. In short,
without an informative context, there is a danger of a granularity clash. The
danger is avoided by the EQ. The EQ would under this analysis be an extra
assumption speakers make in order to ensure that a sentence is meaningful.
(Note that the EQ is not the weakest assumption one could make to ensure
that; perhaps it is the simplest assumption.)
The data above for which the selection analysis automatically makes
good predictions with Max> , (154)–(156), are such that we have a rather clear
expectation about the kind of interval denoted by the than-clause — the range
within which the individual degrees fall is fixed. The context is rich, and
no problems with granularity arise. Thus a genuine Max> interpretation (i.e.
one in which we pick out the maximum from a genuine spread) is possible
without further assumptions. This distinguishes those data from our original
example (165). I suggest that danger of a granularity clash leads to EQ: to
supposing that the ‘points’ that are in danger of being spread over too large
an interval in fact collapse into one. We expect that it should depend on
the amount of information available on the interval covered by the than-
clause whether we get an EQ interpretation or a genuine Max> interpretation.
Additional information to the effect that the points are not the same, but
close enough together for the purposes of the differential, may make the EQ
unnecessary and thus make a genuine Max> interpretation possible for our
EQ data. This appears to me to be correct:
and place them into a less fortunate context, and trigger EQ. Again, this
seems the right prediction.
(169) a. This pot dries out exactly 40 min faster than all the others.
(EQ likely)
b. This T-Shirt dries exactly 20 min faster than all the others.
(EQ likely)
We see that minimal pairs can be found that have essentially the same
comparative (differential plus comparative adjective plus than-clause) but
differ as to informativity of background context regarding the than-clause
interval. An uninformative context makes us assume that the interval is
point-like, so that Max> will be well defined and suitable — EQ. If we have
enough background information to be sure that the Max> unit in the than-
clause interval is suitable, we do not panic, make no extra assumptions, and
can get a genuine Max> interpretation as expected.
Things are different with an existential quantifier. Consider (170) against
the same background as before. The minimal than-clause intervals will be the
heights of the individual girls. Max> will be well defined and suitable without
any additional assumptions, and will make this a comparison between John’s
height and the height of the tallest girl, as desired.
x1 x2 x3 x4 x5 J
4.1 Summary
I could alternatively have assumed that the operator Pi from Heim 2006b
shifts the standard adjective meaning to (175).
Since Pi on the analysis pursued here always takes scope immediately next to
the adjective, this would have served no particular purpose and I simplified
to (175). But a problem for assuming (175) as the basic meaning of a gradable
adjective is that it is very weak. This creates problems for example for the
negation theory of antonymy (compare e.g. Heim 2006a). (178a) analyses
the negative polar adjective short as the negation of tall. I fail to be able to
imagine how a parallel strategy for the interval based meaning (178b) could
be successful.
So if the intervals do not come into the semantics via a motivated independent
(since mobile) operator Pi, and nor are they plausibly basic, how do they come
in? It would be attractive to say that intervals enter the semantics because,
that is, if and only if, they are needed. That is what I would like to think, and
(175) really was a simplification for the sake of uniformity that I think of as
preliminary.
An idea for how to bring intervals into the semantics when needed that is
due to Heim (2009) is given below. We begin by observing that a relation can
be expressed between a plurality and a part of a scale — a degree ‘blob’.
c. ∀x ≤ C : ∃y ≤ M : drank(y)(x)&∀y ≤ M : ∃x ≤ C : drank(y)(x)
Note that the notion of degree ‘blobs’ that have a part/whole structure is
anticipated by the reference to covers in Section 3. A cover provides us
with the relevant parts of the degree scale. We are consistently assuming a
mass-like structure of the degree scale. To make the connection clear, (182′)
provides a more complete formalisation of (180a) which includes covers
(compare Beck 2001 for this kind of use for covers).
to do in order for this idea to apply to the range of data examined in this
paper? I briefly discuss three issues for which this change in perspective is
relevant: (i) universal quantifiers, (ii) singular quantifiers, and (iii) maximal
informativity.
First, regarding universal quantifiers: The introduction of intervals analo-
gously to (182) would have to happen with universal quantifiers of various
kinds, in particular universal nominals and intensional verbs (cf. our two
representative examples every girl and predict). Regarding intensional verbs,
there is a proposal by Bošković & Gajewski (2008) that instead of universal
quantification over worlds (183a) they (or at least some of them) involve sum
formation (183b).
d. λD. [∀x ≤ G : ∃d ≤ D : tall(d)(x)]
       & [∀d ≤ D : ∃x ≤ G : tall(d)(x)]
   intervals that contain the heights of all the girls (and nothing
   else)
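The two conjuncts of the formula above have the shape of a cumulative relation between a plurality and a degree ‘blob’. A minimal sketch, assuming that the parts of D are individual degrees, that tall(d)(x) holds when x's height is exactly d, and that all names in the code are hypothetical:

```python
def cumulative(xs, ys, rel):
    """Cumulative relation between two pluralities:
    every x is related to some y, and every y to some x."""
    return (all(any(rel(x, y) for y in ys) for x in xs)
            and all(any(rel(x, y) for x in xs) for y in ys))


# Hypothetical toy model: two girls and their heights in centimetres.
HEIGHT = {"a": 160, "b": 170}


def tall(x, d):
    """Assumed exact reading of tall(d)(x): x's height is d."""
    return HEIGHT[x] == d
```

A set of degrees stands in the cumulative relation to the girls only if it contains each girl's height and no degree that fails to be some girl's height, which is the ‘and nothing else’ part of the gloss.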
Thus it can be argued that a plural analysis of intervals can capture these
data.15 The discussion from Section 3 is (almost; see below) unchanged; what
changes is what happens below the level of AP, so to speak (the predication ‘x
is d-tall’): what we assumed to be basic in (175) is now compositionally derived
via pluralization mechanisms. Next, let’s reconsider data with singular
quantificational elements:
4.3 Outlook
Let’s take a step back and think about what an analysis of quantifiers in
than-clauses in terms of selection achieves — beyond the empirical coverage
of the mostly well-known set of data that I have been concerned with above.
Compared to its theoretical competitors, it primarily removes quantifiers
in than-clauses from the realm of scope interaction phenomena. For example,
the interpretive behaviour of quantifiers in than-clauses cannot be seen as
an instance of the Heim/Kennedy generalization (Kennedy 1997; Heim 2001).
The analysis I’ve given in Section 3 violates this generalization.
References
konferanser/SuB12/proceedings/krasikova_337-352.pdf.
Krasikova, Sveta & Ventsislav Zhechev. 2006. You only need a scalar only. Pro-
ceedings of Sinn und Bedeutung 10. URL http://www.sfb441.uni-tuebingen.
de/b10/Pubs/KrasikovaZhechev_SuB05.pdf.
Kratzer, Angelika. 1991. Modality. In von Stechow & Wunderlich (1991),
639–650.
Kratzer, Angelika. 1998. Scope or pseudoscope? Are there wide-scope in-
definites? In Susan Rothstein (ed.), Events and grammar. Dordrecht:
Kluwer.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken
Turner (ed.), The semantics/pragmatics interface from different points of
view (Current Research in the Semantics/Pragmatics Interface 1), 257–291.
Elsevier.
Krifka, Manfred. 2007. Approximate interpretation of number words: A case
for strategic communication. In Gerlof Bouma, Irene Maria Krämer &
Joost Zwarts (eds.), Cognitive foundations of interpretation (Verhandelin-
gen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd.
Letterkunde 190), 111–126. Amsterdam: Royal Netherlands Academy of
Arts and Sciences.
Larson, Richard K. 1988. Scope and comparatives. Linguistics and Philosophy
11(1). 1–26. doi:10.1007/BF00635755.
Link, Godehard. 1983. The logical analysis of plurals and mass terms: A
lattice-theoretical approach. In Rainer Bäuerle, Christoph Schwarze &
Arnim von Stechow (eds.), Meaning, use, and interpretation of language,
Grundlagen der Kommunikation und Kognition, 302–323. de Gruyter.
May, Robert. 1985. Logical form: Its structure and derivation (Linguistic
Inquiry Monographs 12). Cambridge, MA: MIT Press.
Meier, Cécile. 2002. Maximality and minimality in comparatives. Sinn und
Bedeutung 6. 275–287. URL http://www.phil-fak.uni-duesseldorf.de/asw/
gfs/common/procSuB6/pdf/articles/MeierSuB6.pdf.
Partee, Barbara H. 1984. Compositionality. In Fred Landman & Frank Veltman
(eds.), Varieties of formal semantics (Groningen-Amsterdam Studies in
Semantics (GRASS) 3), 281–311. Dordrecht: Foris.
Reinhart, Tanya. 1992. Wh-in-situ: An apparent paradox. Proceedings of the
Amsterdam Colloquium 8. 483–492.
van Rooij, Robert. 2008. Comparatives and quantifiers. Empirical Issues in
Syntax and Semantics 7. 423–444. URL http://www.cssp.cnrs.fr/eiss7/
van-rooij-eiss7.pdf.
Semantics & Pragmatics Volume 3, Article 3: 1–41, 2010
doi: 10.3765/sp.3.3
Abstract In this article, I show that there are two kinds of numeral modifiers:
(Class A) those that express the comparison of a certain cardinality with the
value expressed by the numeral and (Class B) those that express a bound
on a degree property. The goal is, first of all, to provide empirical evidence
for this claim and second to account for these data within a framework that
treats modified numerals as degree quantifiers.
1 Introduction
For a long time, there seemed to be agreement in the formal semantic lit-
erature that there was little to be gained from a thorough investigation of
these expressions. An especially dominant view, originating from generalised
quantifier theory (Barwise & Cooper 1981), was that there was not much more
to the semantics of such quantifiers than the expression of the numerical
relations >, <, ≤ and ≥. In the past decade, however, several studies have
shown that this is an overly simplistic assumption. Examples are Hackl 2001,
Krifka 1999 and Takahashi 2006 on comparative quantifiers, Nouwen 2008b
on negative comparative quantifiers, Solt 2007 on differential quantifiers,
Geurts & Nouwen 2007, Umbach 2006, Corblin 2007, Büring 2008 and Krifka
2007b on superlative quantifiers, Corver & Zwarts 2006 on locative quan-
tifiers and Nouwen 2008a on directional quantifiers.1 Such investigations
usually concern the specific quirks of a certain type of modified numeral.
While I believe that it is important to have a semantic analysis of modified
numerals on a case by case basis, I also believe that what is lacking from the
literature so far is a view of to what extent the various modified numerals in
(1) involve the same semantic structures. In this paper, I will attempt to reach
a generalisation along this line by claiming that there are two kinds of modi-
fied numerals: (A) those that relate the numeral to some specific cardinality
and (B) those that place a bound on the cardinality of some property. The
difference will be made clear below. The main examples of (A) are comparative
quantifiers like more/fewer than 100. Most other kinds of modified numerals
fall in the second class.
I will start by making clear what distinguishes the two classes of modified
numerals by presenting a body of data that sets them apart. Then, in section
3, I introduce a well-founded decompositional treatment of comparative
quantifiers, proposed by Hackl (2001), which I take to represent the proper
treatment of class A modifiers. In section 4, I propose that class B modifiers
are operators that indicate maxima/minima. I will then account for the
distribution of these quantifiers by arguing that they are often blocked by
unmodified numerals, which are capable of expressing equivalent meanings.
1 See also Nouwen 2010b for an overview.
Two kinds of modified numerals
This example contrasts strongly with the examples in (3), which are all
unacceptable. (Alternatively, one might have the intuition that they are
false.)
Why is this so? A naive theory might have it that (2) states that the number
of sides in a hexagon is strictly smaller than 11 (i.e. < 11), and that the only
difference with (3) is that (3) states that this number is smaller than or
equal to 10 (i.e. ≤ 10). Clearly, 6 is both < 11 and ≤ 10. So why are not both
kinds of examples under-informative but true? On the naive view, having at
most 10 sides is expected to be equivalent to having fewer than 11 sides. That
is, both these properties pick out objects with n ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
sides. Semantically, no contrast is to be expected. Given this semantic
equivalence, a pragmatic explanation of the contrast between (2) and (3)
seems equally unlikely.2
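The equivalence that the naive view relies on can be checked mechanically. The following is a minimal sketch and not part of the original analysis; it assumes side counts are natural numbers, and the cutoff `LIMIT` is a hypothetical device that only keeps the sets finite.

```python
# Naive denotations of the two modifiers as sets of natural numbers.
# LIMIT is a hypothetical cutoff used only to keep the sets finite.
LIMIT = 100

fewer_than_11 = {n for n in range(LIMIT) if n < 11}
at_most_10 = {n for n in range(LIMIT) if n <= 10}

hexagon_sides = 6

# On the naive view both properties pick out exactly the same numbers ...
same_extension = fewer_than_11 == at_most_10
# ... and the side count of a hexagon satisfies both of them.
hexagon_ok = hexagon_sides in fewer_than_11 and hexagon_sides in at_most_10
```

Since the two extensions coincide, nothing in this naive semantics can distinguish (2) from (3), which is exactly the puzzle.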
Let us call quantifiers that are acceptable in such examples class A quan-
tifiers and those that are like (3) class B quantifiers. As the contrast between
(4) and (5) shows, the distinction is also visible with lower bound quantifiers.
2 A reviewer wondered whether the naive view could not be maintained if we assume that
there is a pragmatic effect associated with the fact that ≤ n includes the possibility of n while
< n excludes it. It is very much unclear what kind of effect that would be, however. One
could, for instance, base a pragmatic inference on the fact that, in (3a), the speaker seems to
signal the possibility that a hexagon has 10 sides by using at most 10. However, one could
equally argue that the same signal is given by the speaker of (2), simply by using fewer than
11 instead of fewer than 10.
R.W.F. Nouwen
That is, (4) is under-informative, yet true and acceptable, while the examples
in (5) are unacceptable/false.
Or, if your computer has a mere 512MB of memory, I can boast that:
In these examples, I am comparing the definite amount of 1GB, i.e. the precise
amount of memory I know my laptop has, to some given contrasting amount
2GB (512MB) by means of less than (more than). This is something class A
quantifiers can do very well, but something that is unavailable for class B
modified numerals:
(9) a. Computers of this kind have {at most / maximally / up to} 2GB of
memory.
b. Computers of this kind have {at least / minimally} 512MB of mem-
ory.
We normally interpret (10) to indicate that the speaker does not know how
many people Jasper invited. That is, it is unacceptable for a speaker to utter
(10) if s/he has a definite amount in mind, which is why the addition of 43, to
be precise in (11) is infelicitous.4
By assuming that the speaker does not know the exact amount, (10) is
interpreted as being about the range of values possible from the speaker’s
perspective. The speaker thus states that there is a bound on that range.
The same intuition occurs if we substitute maximally 50 by any other class B
quantifier.
In sum, I showed that the landscape of modified numerals can be divided
into two separate classes of expressions. What distinguishes class B quanti-
fiers from other modified numerals is that they are incompatible with definite
amounts and are always interpreted with respect to a range of values. Below,
I will present a semantics of class B expressions that makes this intuition
precise. Before I can do so, however, I will need to discuss the semantics of
A-type numeral modifiers.
3 In his comments on this article, David Beaver pointed out examples like (i), where the number
appears to be a variable quantified over.
Although I will not attempt a compositional analysis of cases like (i), such examples do
appear to support the main intuition that class B quantifiers express relations between
amounts and ranges. An example like (i) states that 50 is the maximum of the range formed
by the different number of people present at different times. This is different from (ii),
which states that at any time the number of people present did not exceed 50. (This is true,
for instance, in case from start to finish there were always 20 people present.) So while (i)
expresses a maximum on a range of values created by quantification, (ii) quantifies over
different times and compares the number of people present at that time with 50.
(ii) There were fewer than 50 people there at any one time.
(i) Jasper invited fewer than 50 people to his party. 43, to be precise.
(12) more than 10 = λP.λQ.∃x[#x > 10 & P(x) & Q(x)]
     fewer than 10 = λP.λQ.¬∃x[#x ≥ 10 & P(x) & Q(x)]
In the past decade it has become clear that it is important to have a closer
look at these modified numerals (Krifka 1999; Hackl 2001). In what follows,
I will assume the following semantics of fewer than, which is based on the
arguments in Hackl 2001.
The workings of this definition will become clear below, but one of the main
motivations for an analysis along this line can be pointed out immediately.
The semantics in (13) is simply that of a comparative construction, where
cardinalities are seen as a special kind of degrees. That is, like the comparative,
it involves a degree predicate M and a maximality operator that applies to
this predicate (Heim 2000). In other words, (13) is completely parallel to other
comparatives, like (14). While in (13), M is a predicate like being a number n
such that Jasper invited n people to his party, in (14) M could, for instance,
be filled in with something like being a degree d such that Jasper is tall to
degree d.
5 In a set-theoretic approach (12) would correspond to the perhaps more familiar (i). I discuss
(12) rather than (i) since, in what follows, I will assume a framework that makes use of
sum individuals. It is easy to see that, within their own respective frameworks, (12) and (i)
ultimately yield the same truth-conditions.
This leads to the following interpretation, which results in the desired simple
truth-conditions.
(19) [λM.max_n(M(n)) < 10](λn.∃x[#x = n & sushi(x) & ate(j, x)])
     =β max_n(∃x[#x = n & sushi(x) & ate(j, x)]) < 10
This might seem like a rather elaborate way of deriving the truth-conditions
for such simple sentences. Using (12), we would have derived as truth-
conditions ¬∃x[#x ≥ 10 & sushi(x) & ate(j, x)], which is equivalent to (19),
but which does not require resorting to (moving) degree quantifiers and
silent counting quantifiers. Importantly, however, Hackl’s theory makes some
crucial predictions which are not made by theories assuming a semantics as
in (12).
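The claimed equivalence between the truth-conditions based on (12) and those in (19) can be illustrated on small models. The sketch below is an illustration under stated assumptions, not the paper's formalism: sum individuals are modelled as non-empty frozensets of atoms, the helper `pieces` is hypothetical, and an empty degree predicate is assumed to have maximum 0.

```python
from itertools import combinations

def sums(atoms):
    """All non-empty sums over `atoms`, modelled here as frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c)
            for size in range(1, len(atoms) + 1)
            for c in combinations(atoms, size)]

def fewer_than_gq(k, eaten):
    """(12)-style: no sum x of eaten pieces has #x >= k."""
    return not any(len(x) >= k for x in sums(eaten))

def fewer_than_degree(k, eaten):
    """(19)-style: the maximal n for which a sum of n eaten pieces exists is < k."""
    degrees = {len(x) for x in sums(eaten)}  # extension of the degree predicate
    # Assumption: if nothing was eaten the predicate is empty; its maximum counts as 0.
    return max(degrees, default=0) < k

def pieces(n):
    """Hypothetical helper: a set of n distinct sushi pieces eaten by j."""
    return {f"sushi{i}" for i in range(n)}

# The two formulations agree on every model from 0 to 12 eaten pieces.
agree = all(fewer_than_gq(10, pieces(n)) == fewer_than_degree(10, pieces(n))
            for n in range(13))
```

The two functions only differ in how they arrive at the result, which mirrors the point in the text: the difference between the analyses shows up in their predictions about scope, not in simple sentences.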
If, like degree operators, modified numeral operators can take scope,
we expect to find scope alternations that resemble those found with degree
operators (Heim 2000). As Hackl observed, this prediction is borne out. For
reasons explained in Heim 2000, structural ambiguity arising from degree
quantifiers and intensional operators like modals is only visible with non-
upward entailing quantifiers, which is why all the following examples are
with upper-bounded modified numerals.
The example in (20), for instance, is ambiguous, with (20a) and (20b) as its
two readings.
(20) (Bill has to read 6 books.) John is required to read fewer than 6 books.
One of the readings of (20) states that there is an upper bound on what John
is allowed to read. The more natural interpretation, however, is a minimality
reading, which is about the minimal number of books John is required to
read. (That is, (20) would, for instance, be true if John meets the requirements
as soon as he reads 3 or more books.)
Following Heim (2000), Hackl analyses this ambiguity as resulting from
alternative scope orderings of the modal and the comparative quantifier. The
upper bound reading, (20a), corresponds to a logical form where the modal
takes wide scope. The minimality reading involves the maximality operator
intrinsic to the comparative construction taking wide scope over the modal
(Heim 2000).
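The two scope orderings can be made concrete in a toy model. In this sketch, which is an illustration and not Hackl's or Heim's implementation, each permissible world is represented simply by the number of books John reads in it, and the degree predicate is the weak one (every n up to that number). The scenario `three_or_more` and its cap of 20 are assumptions chosen to match the parenthetical remark about (20).

```python
def degrees_in(books_read):
    """Weak degree predicate at a world: every n from 1 up to the number of books read."""
    return set(range(1, books_read + 1))

def modal_wide(worlds, k):
    """Modal > max (upper bound reading): in every permissible world the maximal n is below k."""
    return all(max(degrees_in(w), default=0) < k for w in worlds)

def max_wide(worlds, k):
    """max > modal (minimality reading): the maximal n that holds in every
    permissible world is below k. Assumes at least one permissible world."""
    required = set.intersection(*(degrees_in(w) for w in worlds))
    return max(required, default=0) < k

# Hypothetical scenario: John complies as soon as he reads 3 or more books.
three_or_more = list(range(3, 21))

upper_bound_reading = modal_wide(three_or_more, 6)
minimality_reading = max_wide(three_or_more, 6)
```

In this scenario only the minimality reading comes out true, since nothing stops John from reading more than 5 books while the number he must read in any case is 3.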
which is very weak, stating simply that values below the numeral are within
what is permitted, without stating anything about the permissions for higher
values. (That is, the reading intended in (23b) is, for instance, verified by a
situation where there are no restrictions whatsoever on what John is allowed
to read. Clearly, (23a) would be false in such a situation.)
As before, these readings can be predicted to exist on the basis of the relative
scope of modal and comparative quantifiers.
The reader may check that Hackl’s predicted readings in (24) and (25) are
indeed the attested ones.
These observations add to the data separating class A from class B quanti-
fiers. Summarising, the distinctions are then as follows. First of all, class B
quantifiers, but not class A quantifiers, resist definite amounts, except when
embedded under an existential modal. Second, class B quantifiers, but not
class A quantifiers, resist weak readings when embedded under an existential
modal.
In the next section I will argue that the peculiarities of class B quantifiers
can be explained if we assume that they are quite simply maxima and minima
indicators. Basically, what I propose is that the semantics of maximally
(minimally) is simply the operator max_d (min_d). This might be perceived as
stating the obvious. What is not obvious, however, is how such a proposal
accounts for the difference between class A and class B quantifiers. I will
argue that the limited distribution of class B modifiers is due to the fact that
they give rise to readings that are in competition with readings available
for non-modified structures. I will show that, in many circumstances, the
application of a class B modifier to a numeral yields an interpretation which
is equivalent to one that was already available for the bare numeral. Before I
can explain the proposal in detail, I therefore need to include an account of
bare numerals in the framework.
I assume that, like the meaning in (31a), the meaning in (31b) is semantic and
not the result of a scalar implicature that results from (31a). See e.g. Geurts
2006 for a detailed ambiguity account, and for some compelling arguments
in favour of it.6
In the current framework, that of Hackl 2001, the weak reading in (31a) is
due to a weak semantics for the counting quantifier, i.e. many1. I propose
that the strong reading, (31b), is accounted for by an alternative quantifier
many2 (taking inspiration from Geurts 2006).7
Not only does the option of two counting quantifiers, many1 and many2,
suffice to account for the ambiguity of bare numerals, it is moreover harmless
with respect to the semantics of comparative quantifiers. A sentence like
Jasper read more than 10 books is not ambiguous. It is important to show
that the availability of two distinct counting quantifiers does not predict
ambiguities in such examples. It will be instructive to see in somewhat more
detail why this is indeed the case.
The structure in (34) is exemplary of a simple sentence with a modified
numeral object. As explained earlier, the modified numeral applies to the
degree predicate that is created by moving the quantifier out of the DP.
Now that there is a choice between two counting quantifiers, the denotation
of the degree predicate depends on which of many1 and many2 is chosen. The
predicate in (35) is the result of a structure containing many1 ; the predicate
in (36) is based on many2. If, in the actual world, Jasper read 10 books, then
(35) denotes {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. When, however, the predicate contains
the many2 quantifier, the denotation is a singleton set: {10} if Jasper read
10 books. This is because only the maximal group of books read by Jasper
is the unique group of books he read with its cardinality; for any smaller n,
there are several distinct groups of n books that he read.
In general, the many2-based degree predicate extension is a singleton set
containing the maximum of the values in the denotation of the many1-based
degree predicate.
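The relation between the two degree predicates can be computed directly. The following sketch assumes, in line with the ∃ versus ∃! contrast used later in the text, that many1 requires some sum of the given cardinality and many2 requires exactly one such sum; sums are modelled as frozensets, and the book names are hypothetical.

```python
from itertools import combinations

def sums_of_size(atoms, n):
    """All sums of exactly n atoms, modelled as frozensets."""
    return [frozenset(c) for c in combinations(sorted(atoms), n)]

def many1_degrees(read):
    """Degrees n such that SOME sum of n books read exists."""
    return {n for n in range(1, len(read) + 1) if sums_of_size(read, n)}

def many2_degrees(read):
    """Degrees n such that EXACTLY ONE sum of n books read exists."""
    return {n for n in range(1, len(read) + 1)
            if len(sums_of_size(read, n)) == 1}

# Hypothetical model: Jasper read ten distinct books.
books_read = {f"book{i}" for i in range(1, 11)}

weak = many1_degrees(books_read)      # the full range 1..10
strong = many2_degrees(books_read)    # only the maximal cardinality
```

A comparative quantifier only looks at the maximum of the predicate, and the maximum of `weak` equals the single member of `strong`, so the choice of counting quantifier cannot create an ambiguity here.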
Given that the relation between (38) and (37) is once again one of a set and its
maximal value, no ambiguities can be expected to arise when comparative
quantifiers are applied to these two predicates. This is as is desired.
Of course, it could be that the actual situation is not one containing a
specific requirement, but one with, for instance, a minimality requirement.
Say, for instance, Jasper has to read at least 4 books. In that case, (37) denotes
the set {1, 2, 3, 4}. The extension of (38), however, is the empty set. (In such a
context, there is no specific n such that Jasper has to read exactly n books.)
Clearly, the maximal value for the predicate is undefined in such a case.
This means that the logical form based on many2 will not lead to a sensible
interpretation and, so, we again do not expect to find ambiguity.
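The behaviour of the two predicates under a necessity modal can be sketched as follows. This is a simplified model and an assumption of the illustration: each permissible world is reduced to the number of books Jasper reads in it, many1 holds of n in a world when n does not exceed that number, and many2 holds of n only when n equals it. The cap of 15 in `at_least_four` is hypothetical and only keeps the model finite.

```python
def box_many1(worlds):
    """n such that in EVERY permissible world some sum of n books is read."""
    return {n for n in range(1, max(worlds) + 1) if all(n <= w for w in worlds)}

def box_many2(worlds):
    """n such that in EVERY permissible world exactly n books are read."""
    return {n for n in range(1, max(worlds) + 1) if all(n == w for w in worlds)}

exactly_ten = [10]                    # a specific requirement
at_least_four = list(range(4, 16))    # a minimality requirement (hypothetical cap of 15)

specific_weak = box_many1(exactly_ten)
specific_strong = box_many2(exactly_ten)
minimal_weak = box_many1(at_least_four)
minimal_strong = box_many2(at_least_four)
```

With a specific requirement the strong predicate is again the singleton of the weak predicate's maximum; with a minimality requirement it is empty, so no maximum is defined and no second reading arises.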
The case of predicates that are formed by abstracting over an existential
modal operator is illustrated in (39) and (40). If Jasper is allowed to read a
maximum of 10 books, then the two predicates are equivalent, both denoting
the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.9
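For the existential modal, the same kind of toy model shows why the two predicates coincide, and why an additional lower bound changes the sets but not their maximum. As before, this is a sketch under the assumption that a permissible world can be represented by the number of books Jasper reads in it; the scenario names are hypothetical.

```python
def diamond_many1(worlds):
    """n such that in SOME permissible world a sum of n books is read."""
    return {n for n in range(1, max(worlds) + 1) if any(n <= w for w in worlds)}

def diamond_many2(worlds):
    """n such that in SOME permissible world exactly n books are read."""
    return {n for n in range(1, max(worlds) + 1) if any(n == w for w in worlds)}

up_to_ten = list(range(0, 11))      # anything from 0 to 10 books is allowed
four_to_ten = list(range(4, 11))    # in addition, fewer than 4 is not allowed

same_without_lower_bound = diamond_many1(up_to_ten) == diamond_many2(up_to_ten)
weak_bounded = diamond_many1(four_to_ten)
strong_bounded = diamond_many2(four_to_ten)
```

In the second scenario the weak predicate is still the full range from 1 to 10 while the strong one starts at 4, yet both have 10 as their maximum, as the accompanying footnote states.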
In sum, the option of two counting quantifiers many1 and many2 is irrelevant
when combined with a comparative quantifier. This is because the compara-
9 If there is in addition a lower bound, the two predicates are no longer equivalent, but their
maximum will be.
In the formula in (41), MOD↓B generalises over any of the class B modifiers at
most, maximally, up to, etc.10
It goes beyond the scope of this article to implement a formal connection between (ii) and
(41), but it should be clear that the underlying mechanism is the same.
part of Grice’s maxim of Manner (Grice 1975), steers toward minimising the
form used to express something. This causes simple (unmarked) meanings to
be typically expressed by means of simple (unmarked) forms. Marked forms
which by convention could be given the same unmarked meaning as some
unmarked form are instead given a more marked interpretation. There are
many variations and implementations of this idea (McCawley 1978; Atlas &
Levinson 1981; Blutner 2000; van Rooij 2004),11 but what is most relevant for
this paper is the general idea that an unmarked meaning is blocked as an
interpretation for the marked form.
With this in mind, the equivalence of (42a) and (42b) whenever M denotes
a singleton set has profound consequences for when it actually makes sense
to state that the maximum of a degree predicate equals a certain value. That
is, in cases where (42a) equals (42b), we expect that the use of maximally
does not lead to an interpretation based solely on (42a), since the use of the
bare numeral form would result in the same meaning. To illustrate this in
some more detail let us carefully go through the following examples.
We know from the discussion above that one of the interpretations avail-
able for (43) is (44).
The interpretations in (46) and (47) are equivalent. In fact, just like we do
not expect ambiguities to arise with comparative quantifiers on the basis
of the many1 /many2 choice, we do not expect any ambiguities to arise with
MOD↓B quantifiers, for the simple reason that both such operators involve
11 In fact, there is a close resemblance between this prevalent idea in pragmatics and blocking
principles in other parts of linguistics. The commonality is that two different expressions
cannot have identical meanings. See, for instance, the Elsewhere Condition (Kiparsky 1973)
in phonology or the Avoid Synonymy principle (Kiparsky 1983) in morphology.
Importantly, the single reading of (45) is equivalent to (44), the strong reading
of (43). The example in (43), however, reaches this interpretation by means
of a much simpler linguistic form, one which does not involve a numeral
modifier. I propose that this is why the reading in (48) of (45) does not
surface: it is blocked by (43).12
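The condition under which blocking applies can be stated as a small check. The sketch assumes, as the surrounding prose suggests, that the modified form says that the maximum of a degree predicate M equals n while the bare numeral form says that M holds of n; the function names and the candidate range are hypothetical.

```python
def max_is(M, n):
    """Modified form: the maximum of the degree predicate M equals n."""
    return bool(M) and max(M) == n

def holds_of(M, n):
    """Bare numeral form: the degree predicate M holds of n."""
    return n in M

def equivalent_on(M, candidates):
    """True if both forms have the same truth value for every candidate n."""
    return all(max_is(M, n) == holds_of(M, n) for n in candidates)

candidates = range(1, 21)
singleton = {10}                  # e.g. a many2-based, non-modal predicate
interval = set(range(1, 11))      # a predicate denoting a range of values

blocked_case = equivalent_on(singleton, candidates)
unblocked_case = equivalent_on(interval, candidates)
```

When M is a singleton the two forms never come apart, so the marked form has nothing to add and its reading is blocked; once M denotes a range, the forms differ and the modifier becomes informative.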
As observed above, we can nevertheless make sense of (45) once we
interpret the sentence to be about what the speaker holds possible. So, a
further possible reading for (45) is that in (49).
In other words, the meaning in (49) for (45) is not blocked by the bare numeral
form in (43) since (43) lacks this reading.
To be sure, I do not claim that (50) would be an available reading for (43).
That is, the particular kind of interpretation that examples like (45) receive
is available only as a last resort strategy. Underlying this analysis is the as-
sumption that there exist silent modal operators. I can offer no independent
evidence for this assumption, but stress that the intuitions regarding examples
like (45) quite clearly point in the direction of some sort of speaker
modality. In work on superlative quantifiers, we find some alternatives to
the present account. Such approaches are meant to deal with at most and at
least only, but if my arguments above are on the right track, then we could
reinterpret these proposals for the semantics of superlative quantifiers as
applying to the whole of class B. For instance, the analysis of class B expres-
sions presented here differs from that of superlative modifiers in Geurts &
Nouwen 2007. According to the present proposal, the modal flavour of (45) is
due to a silent existential modal operator. In Geurts & Nouwen, however, the
modal was taken to be part of the lexical content of superlative quantifiers.
Another alternative, proposed for superlative modifiers in Krifka 2007b and
which is closer to the present proposal, is to analyse examples like (45) not as
involving a modal operator, but rather a speech act predicate, like assert. In
that framework, the analysis of (45) would say that n=10 is the maximal value
for which ∃(!)x[#x = n & people(x) & invite(j, x)] is assertable, rather than
possible.13 That is, according to Krifka, (45) is interpreted by assigning the
modified numeral scope over an illocutionary force operator, rather than
over a modal operator.
I will return to a comparison of these approaches below. I would like to
point out immediately, however, what I think are the major disadvantages of
both alternatives. The main problem is with examples like (51), which contain
an overt existential modal.
(i) I know how many people were at the party, but I’ve been told not to reveal that
number to the press. However, there were maximally 50 there.
It would be interesting to see if data like these help in reaching a synthesis of Krifka’s
account and the present proposal.
Its most salient reading is one in which 10 is said to be the maximum number
of people Jasper is allowed to invite. That is, it places an upper bound on
what is allowed. For Krifka, this is problematic since, here, the modified
numeral is quite obviously not a speech act operator. For the proposal in
Geurts & Nouwen 2007, such examples are problematic since the modal
lexical semantics of at most predicts a reading with a double modal operator,
one originating from the verb and one from the numeral modifier. To remedy
this, Geurts and Nouwen provide an essentially non-compositional analysis
of such examples as modal concord.14
In contrast, the current proposal deals effortlessly with examples such
as (51). What was crucial to my explanation of how (45) gets to be interpreted
is that degree predicates based on modals with existential force denote
non-singleton sets even when the counting quantifier associated with the
numeral is many2 . This entails that saying that the maximum value for such
a predicate is n is not equivalent to saying that the predicate holds for n.
More formally, there is a contrast between (52a) and (52b).
If maximally 10 is taken to have wide scope over the modal, then we arrive
at (54a), the reading that says that the maximum number of people Jasper
is allowed to invite equals 10. This is not a semantic interpretation that is
available for (55). Its many2 reading, for instance, says that inviting exactly
10 people is something that Jasper is allowed to do. This is much weaker
than (54a). (The only way we can arrive at an equally strong reading for (55)
is by means of implicature.)
If we take the modal in (51) to have widest scope, as in (54b), the resulting
interpretation is one in which inviting exactly 10 people is allowed for Jasper.
This is the reading for (55) discussed above, and so it is blocked. As a result,
(54a) is the only interpretation available.
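The difference in strength between the wide-scope reading and the bare numeral reading can be checked on two scenarios. This is an illustrative sketch: permissible worlds are represented by the number of people Jasper invites, the strong (many2) predicate under an existential modal is taken to collect the positive numbers realised in some permissible world, and the cap of 30 in `no_limit` is a hypothetical stand-in for the absence of any restriction.

```python
def diamond_many2(worlds):
    """n >= 1 such that in SOME permissible world exactly n people are invited."""
    return {w for w in worlds if w >= 1}

def max_wide(worlds, n):
    """Reading (54a): the maximum of the modal degree predicate equals n."""
    M = diamond_many2(worlds)
    return bool(M) and max(M) == n

def modal_wide(worlds, n):
    """Reading (54b), shared with the bare numeral: inviting exactly n is allowed."""
    return n in diamond_many2(worlds)

limit_is_ten = range(0, 11)   # up to 10 guests are allowed
no_limit = range(0, 31)       # hypothetical: no real upper limit

strong_with_limit = max_wide(limit_is_ten, 10)
weak_with_limit = modal_wide(limit_is_ten, 10)
strong_without_limit = max_wide(no_limit, 10)
weak_without_limit = modal_wide(no_limit, 10)
```

The weak reading is true in both scenarios, while the wide-scope reading is true only when 10 really is the limit, which is why it is not blocked by the bare numeral form.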
An interesting side to the account presented here is that the upper bound
class B quantifiers do not encode the ≤ relation. As maxima indicators, their
application only makes sense if what they apply to denotes a range of values.
Otherwise, using the strong reading of the bare numeral form will do just as
well.
Interestingly, the approach also predicts that some of the examples I
discussed above do not only result in a blocking effect, but could moreover
be predicted to be false. For instance, according to the approach set out
above, the meaning of (56a) is that in (56b).
15 As far as I can see, assertability would have the same (crucially weak) properties as possibility.
So, should a silent speech act predicate seem more plausible than a silent modal operator,
then ♦ can just as well be interpreted as expressing assertability. It appears that such a
move would be largely compatible with the proposal of Krifka 2007b.
The reading in (56b) is not only blocked by A triangle has 10 sides, but
it is moreover plainly false. I believe that this predicts that (56a) should
be expected to have a somewhat different status from (57), which strictly
speaking has a true interpretation, but one that can be expressed by simpler
means.
Note first that minimality operators are sensitive to the many1 / many2
distinction. Consider the degree predicate [λd. John read d many1/2 books]
and, say, that John read 10 books. In the many1 version of the logical form,
the minimal degree equals 1. In fact, independent of how many books John
read, as long as he read books, the minimal degree will always be 1. In the
many2 version of the logical form, the predicate denotes a singleton set, {10}
if John read 10 books. The minimal degree in that case is, of course, 10.
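The sensitivity of the minimality operator to the choice of counting quantifier is easy to verify. In this sketch the two degree predicates are given directly as sets of numbers, under the assumption (as above) that the weak predicate is the range from 1 to the number of books read and the strong predicate is the singleton of that number.

```python
def many1_degrees(books_read):
    """Weak predicate: every n from 1 up to the number of books read."""
    return set(range(1, books_read + 1))

def many2_degrees(books_read):
    """Strong predicate: only the exact number of books read (if any were read)."""
    return {books_read} if books_read >= 1 else set()

# Minimal degree of the weak predicate for 1 to 15 books read.
minima_many1 = {k: min(many1_degrees(k)) for k in range(1, 16)}
# Minimal degree of the strong predicate if John read 10 books.
minimum_many2 = min(many2_degrees(10))
```

Every entry in `minima_many1` is 1, whatever the number of books read, whereas the strong predicate returns the number itself.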
These observations already straightforwardly account for our intuitions
for an example like (60).
The many1 interpretation of (60) will be rejected, for it will always be false.
The minimal value for any simple many1 -based degree predicate is always 1.
The many2 interpretation of (60) will be rejected too, for it will correspond
to an interpretation saying that John read (exactly) 10 books. This reading is
blocked by the bare numeral. (In fact, (60) in the many2 variant is equivalent
to John read maximally 10 books, which, as was explained above, is blocked
for the same reasons.)
We can save (60) by interpreting it with respect to an existential modal
operator. This yields two readings:
The form in (61a) is once more a contradiction: the minimal degree for which
it is deemed possible that John read d-many1 books is always 1. The reading
in (61b) is much more informative. It says that the minimal number for
which it is thought possible that John read exactly so many books is 10. In
other words, this says that it is regarded as impossible that John read fewer
than 10 books. This is exactly the reading that is available.
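The two modal readings can be compared in a toy epistemic model. The sketch assumes that each world the speaker considers possible is represented by the number of books John read in it; the scenario in which the speaker considers 10 to 15 books possible is hypothetical.

```python
def diamond_many1(worlds):
    """n such that in SOME epistemically possible world a sum of n books was read."""
    return {n for w in worlds for n in range(1, w + 1)}

def diamond_many2(worlds):
    """n >= 1 such that in SOME epistemically possible world exactly n books were read."""
    return {w for w in worlds if w >= 1}

# Hypothetical speaker state: John read somewhere between 10 and 15 books.
epistemic = range(10, 16)

reading_61a = min(diamond_many1(epistemic)) == 10
reading_61b = min(diamond_many2(epistemic)) == 10
no_world_below_ten = all(w >= 10 for w in epistemic)
```

The many1-based reading fails because its minimum is 1, while the many2-based reading holds exactly when the speaker excludes every world with fewer than 10 books and still considers exactly 10 possible.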
Some words are in order on the interaction of numeral modifiers with non-
modal operators. Given the current proposal, any property that involves
existential quantification would license the use of a class B modifier. However,
it is known that degree operators (which we take modified numerals to be)
cannot move to take scope over nominal quantifiers (cf. Kennedy 1997; Heim
2000).16 This explains why (62) does not have the reading in (63).
(66) min > □:
The minimum n such that Jasper should read n books is 10
a. min_n(□∃x[#x = n & book(x) & read(j, x)]) = 10   many1
b. min_n(□∃!x[#x = n & book(x) & read(j, x)]) = 10   many2
It turns out that none of these logical forms provides a reading that is
in accordance with our intuitions regarding (64). First of all, notice that
min_n(∃x[#x = n & book(x) & read(j, x)]) = 10 is a contradiction. If there
are 10 books that Jasper read, then there is also a singleton group containing
a book Jasper read. The minimum number of books Jasper read is therefore
either 1 (in case he read something) or 0 (in case he did not read anything).
It could never be 10. Consequently, (65a) is a contradiction. For a similar
reason, (66a) is a contradiction too. If there needs to be a group of 10 books
read by Jasper, then there also need to exist groups containing just a single
book read by Jasper. Once again, the minimum number referred to in (66a)
is either 0 or 1, never 10.
17 In this paper, I ignore readings which (for the case of at least) Büring (2008) calls speaker
insecurity readings and which Geurts & Nouwen (2007) discuss extensively. Basically, this
reading amounts to interpreting the modal statement with respect to speaker’s knowledge.
Such readings are especially prominent with superlative quantifiers. For instance, the speaker
insecurity reading of Jasper should read at least 10 books is: the speaker knows that there is
a lower bound on the number of books that Jasper should read, s/he does not know what
that lower bound is, but s/he does know that it exceeds 9.
Furthermore, I also ignore a reading of (64) in which 10 books is construed as a specific
indefinite. In that reading, (64) states that there are 10 specific books such that only if Jasper
reads these books will he comply with what is minimally required.
Turning to (65b), notice that the min_n-operator is vacuous here, since
there is just a single n such that Jasper read exactly n books. This renders
(65b) equivalent to the many2 reading of Jasper should read 10 books, and
so we predict it to be blocked. The interpretation in (66b) does not fare any
better. In fact, the min_n-operator is vacuous here as well. This means that
(65b) is equivalent to (66b) and that it is consequently also blocked. Even if
no blocking were to take place, (65b)/(66b) offer the wrong interpretation
anyway. They state that Jasper must read exactly 10 books (no more, no
fewer), which is not what (64) means.
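The failure of the wide-scope logical forms in (66) can be reproduced in a toy deontic model. The sketch makes the same simplifying assumptions as before: a permissible world is the number of books Jasper reads in it, and the cap of 20 is hypothetical. Treating an undefined minimum (empty predicate) as a failed reading is a further assumption of this illustration.

```python
def box_many1(worlds):
    """n such that in EVERY permissible world some sum of n books is read."""
    return {n for n in range(1, max(worlds) + 1) if all(n <= w for w in worlds)}

def box_many2(worlds):
    """n such that in EVERY permissible world exactly n books are read."""
    return {n for n in range(1, max(worlds) + 1) if all(n == w for w in worlds)}

def min_is(M, n):
    """The minimum of M equals n; an empty M is treated as a failed reading."""
    return bool(M) and min(M) == n

ten_or_more = list(range(10, 21))   # the situation (64) intuitively describes
exactly_ten = [10]                  # Jasper must read 10 books, no more, no fewer

reading_66a = min_is(box_many1(ten_or_more), 10)
reading_66b = min_is(box_many2(ten_or_more), 10)
reading_66b_exact = min_is(box_many2(exactly_ten), 10)
```

Neither form is true in the scenario that (64) intuitively describes: the many1 form always has minimum 1, and the many2 form only succeeds when exactly 10 books are required.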
One might think that the problems with (65b) and (66b) can be remedied
by abandoning quantification over sums and instead using reference to
(maximal) sums. For instance, (67) represents the truth-conditions we are
after. (Here σx returns the maximal sum that, when assigned to x, verifies the
scope of σ.)
Still, here too the application of min_n is not meaningful, since there is only
a single n such that □[#σx(book(x) & read(j, x)) ≥ n] holds, which is 10
if (64) is true. As a consequence, it would not matter whether we applied
a maximality or a minimality operator. We then wrongly predict that (68)
should share a reading with (64). (Note that (65b) and (66b) suffer from the
same odd prediction, given that the operator min_n has no semantic impact
there either.)
It appears then that the proposal defended in this article fails hopelessly on
sentences like (64). As I will show, however, things are not so dire as they
appear. In fact, I will argue that what we stumble upon here is a general,
but poorly understood property of modals, which could be summarised as
follows:
I will not offer an explanation for this generalisation (but see Nouwen 2010a
for an attempt). I will simply show that if we look a bit closer at the inter-
pretation of modal operators, then we come to understand that my theory
actually yields a welcome analysis.
The approach of Geurts and Nouwen is the most broadly applicable approach
to superlative quantifiers in the (admittedly small body of) literature on that
topic. There are alternatives on the market, but they do not handle examples
like these very well. As I mentioned above, Krifka (2007b) takes at least to
be a speech act modifier. Basically, an example like (71) is analysed by Krifka
in terms of what the speaker finds assertable and is paraphrased as follows:
the lowest n such that it is assertable that John read n books is 10. When
at least is embedded in an intensional context, however, it does not modify
the strength of assertability, but rather the intensional operator. So, taking
Krifka’s analysis as suitable not just for superlative, but rather for all class B
quantifiers, (72a) would be paraphrased as (73).
(73) 10 is the smallest value for n such that John should read n books
In such cases, Krifka’s analysis is identical to the one I have set out above
and it runs into exactly the same problem: (73) is not the reading we are after.
Rather, (72a) means that 10 is the smallest number of books John is allowed
to read.
Geurts & Nouwen (2007) and Krifka (2007b) say nothing about the distinc-
tion between class A and class B expressions. However, if we extend their
proposals for superlative quantifiers to cover all B-type quantifiers, then we
have an interesting trio of competing characterisations of such expressions.
At face value, the observations made so far in this section would appear to
speak in favour of the modal concord proposal of Geurts & Nouwen (2007)
(generalised to all class B quantifiers) and against the account defended here
or in Krifka 2007b. As I will argue now, however, there are reasons to believe
that the problematic predictions made by the latter two theories are not due
to the semantics of the modified numeral, but are actually the result of an
overly simplistic understanding of requirements. What I will do is discuss in
some detail examples like (74).
(74) The minimum number of books John needs to read to please his
mother is 10.
Note, secondly, that (74) spells out the semantics I have proposed for (75).
What I will show now is that when we look into the semantic details of (74),
we will run into exactly the same problems as we did for (75). What this
shows is that rather than thinking that my account of class B quantifiers is
on the wrong track, there are actually reasons to believe that the proposal
lays bare a hitherto unexplored problem for the semantics of modals like
need, require, etc.
Let us consider the semantics of (74). Say that, in fact, the minimal
requirements for pleasing John’s mother are indeed John reading 10 books.
That is, if John reads 10 or more books, she is happy. If he reads fewer,
she will not be pleased. Standard accounts of goal-directed modality (von
Fintel & Iatridou 2005) assume that statements of the form to q, need to p
are true if and only if p holds in all worlds in which the goal q holds. Below, I
refer to the worlds in which John pleases his mother as the goal worlds. It is
instructive to see what we know about the propositions that are true in such
worlds. The following is consistent with the context described above.
(76) a. In all goal worlds: ∃x[#x = 10 & book(x) & read(j, x)]
b. In all goal worlds: ∃x[#x = 9 & book(x) & read(j, x)]
c. In all goal worlds: ∃x[#x = 1 & book(x) & read(j, x)]
d. In some (not all) goal worlds: ∃x[#x = 11 & book(x) & read(j, x)]
e. In some (not all) goal worlds: ∃x[#x = 12 & book(x) & read(j, x)]
f. In no goal world: ¬∃x[book(x) & read(j, x)]
Let us now analyse some examples. First of all, (77a) and (77b) are intuitively
true and are also predicted to be true ((77a) by virtue of (76a) and (77b) by
virtue of (76c).)
The example in (78) is intuitively false, and is also predicted to be false, for
the context is such that there are goal worlds in which John reads only 10,
and not 11, books.
virtue of (76c)).
(79) The minimum number of books John needs to read, to please his
mother, is 1.
(80) min_n[In all goal worlds: ∃x[#x = n & book(x) & read(j, x)]] = 1
In general, theories such as that of von Fintel & Iatridou (2005) predict that
if S is an entailment scale of propositions, and p is a proposition on this
scale, then if p is a minimal requirement for some goal proposition q, then
a statement of the form “the minimum requirement to q is p” is always
predicted to be false, except when p is the minimal proposition of S. This
makes a devastating prediction, namely that minimal requirements could
never be expressed, since they would always correspond to the absolute
minimum.
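The prediction can be replayed mechanically on the context in (76). The following is a minimal sketch, not part of the analysis itself. It assumes that a goal world can be represented by the number of books John reads in it, that the three worlds listed stand in for the full set of goal worlds, and that candidate values of n come from a finite range. The names goal_worlds, at_least_in_all and min_requirement are illustrative only.

```python
# Assumption of this sketch: a goal world is represented by the number of
# books John reads in it. In the context of (76), John pleases his mother
# exactly when he reads 10 or more books.
goal_worlds = [10, 11, 12]  # hypothetical finite stand-in for the goal worlds


def at_least_in_all(n, worlds):
    """In all goal worlds: John read (at least) n books, as in (76a-c)."""
    return all(w >= n for w in worlds)


def min_requirement(worlds, candidates=range(1, 21)):
    """min_n [In all goal worlds: John read n books], as in (80)."""
    return min(n for n in candidates if at_least_in_all(n, worlds))


print(min_requirement(goal_worlds))  # 1, not the intuitive 10
```

Under these assumptions the minimality operator returns the bottom of the scale, since every lower value on an entailment scale is also required in all goal worlds.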
One might think that what is going wrong in the example above is that I
assume that when we talk about how many books John read we should be
talking about existential sentences, that is about at least how many books
John read. The alternative would be to describe the number of books John
read by means of the counting quantifier many_2, that is, how many books
John read exactly. I’m afraid this only makes the problem worse. Here is a
description of the relevant context in terms of the exact number of books
that were read by John.
(81) a. In some but not all goal worlds: John read exactly 10 books.
b. In no goal world: John read exactly 9 books.
c. In no goal world: John read exactly 1 book.
d. In some but not all goal worlds: John read exactly 11 books.
e. In some but not all goal worlds: John read exactly 12 books.
Now, there is no number n such that John read exactly n books in all goal
worlds. So, the smallest number of books John needs to read does not refer.
The upshot is that there is no satisfactory analysis of examples like (74)
under the assumptions made here. In general, it seems that, under standard
assumptions, there is no satisfactory analysis of minimal requirements.
Whatever way we find to fix the semantics of cases like (74), however, this fix
will work to save the account of class B quantifiers too, for (74) was a literal
spell-out of the proposed interpretation of similar sentences with at least,
minimally, etc. It goes beyond the scope of this article to provide such a fix.
The overview in (81), however, can help to indicate where we should look for
a solution.20 Given that there is no goal world in which John read exactly
n books for n’s smaller than 10, it follows that 10 is the minimal number
of books John could read to please his mother. In other words, examples
like (74) show that, in the scope of a minimality operator, modals that are
lexically universal quantifiers get a weaker interpretation.
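The difference between the universal construal and the weaker one can be made concrete on the context in (81). The sketch below is illustrative only and does not explain why the weakening occurs. It assumes, hypothetically, that three goal worlds suffice to represent the context and that each is identified with the exact number of books John reads in it.

```python
# Assumption of this sketch: each goal world is identified with the exact
# number of books John reads in it, as in the overview in (81).
goal_worlds = [10, 11, 12]  # hypothetical finite stand-in for the goal worlds
candidates = range(1, 21)

# Universal construal: n such that John read exactly n books in ALL goal worlds.
universal = [n for n in candidates if all(w == n for w in goal_worlds)]

# Weaker construal: n such that John read exactly n books in SOME goal world.
existential = [n for n in candidates if any(w == n for w in goal_worlds)]

print(universal)         # []  (no such n, so the description fails to refer)
print(min(existential))  # 10  (the intuitive minimal requirement)
```

Only the existential construal makes the minimum well defined in this toy context, and it yields the intuitively correct value.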
That said, it is time to revisit example (64), repeated here as (82).
In the context of the question asked in (86), the imperative does not convey
that to comply with the advice, the hearer has to stop buying cigarettes.
Instead, it is interpreted as stating that one of the things one could do to
save money is to stop buying cigarettes. Thus, examples like these display
a mechanism that is similar to the interaction of numeral modifiers and
modality.
The mysterious interaction of modified numerals and modals is moreover
reminiscent of the interaction of modals and disjunction (Zimmermann 2000;
Geurts 2005; Aloni 2007), especially since, on an intuitive level at least,
a class B modified numeral like minimally 10 (and, quite obviously, 10 or
more) appears to correspond to a disjunction of alternative cardinalities,
with 10 as the minimal disjunct.21 A central issue in the literature on modals
and disjunction is that classical semantic assumptions fail to capture the
entailments of sentences where a disjunctive statement is embedded under a
modal operator (Kamp 1973). A detailed comparison of this complex issue
with the discussion of minimal requirements that I presented here, however,
will be left to further research.
In this section, I will attempt to give some initial answers to three empirical
questions concerning the distinction between class A and B modified nu-
merals that is central to this article. First of all, I turn to the issue of which
expressions go with which class. So far, I have restricted my attention mostly
to, on the one hand, comparative quantifiers (as proto-typical class A expres-
sions) and, on the other hand, superlative, minimality/maximality and up
to-modified numerals (as representatives of class B). What about expressions
like the prepositional over n or under n or the double bound between n and
m or from n to m? Below, I will turn briefly to such expressions.
A second empirical question concerns the validity of the examples used
so far. Although I believe that the intuitions concerning the constructed
examples in this article are rather clear, my plea for two kinds of modified
numerals would still benefit from some independent objective support.
Below, I present the results of a small corpus study that clearly reflects the
distinction argued for in this article.
Finally, this section will turn to the cross-linguistic generality of the
proposal. I will provide data from a more or less random set of languages that
suggest that the class A/B distinction is not a quirk of English or Germanic,
or even Indo-European, but is, in fact, quite general.
21 See Nilsen 2007 and Büring 2008 for suggestions along this line for the modifier at least
only.
I will leave it an open question exactly which quantifiers belong to which class.
Nevertheless, I can already offer some speculations on several quantifiers
that I have so far not discussed. To start with disjunctive quantifiers, it
appears that these are clear cases of class B expressions.
With disjunctive quantifiers in class B, one might wonder whether there are
any examples of class A expressions which are not the familiar comparative
quantifiers more/fewer/less than n. I think that locative prepositional modi-
fiers are a likely candidate for class A membership, however. In fact, I believe
that the locative/directional distinction in spatial prepositions corresponds
to the class A/B distinction when these prepositions are used as numeral
modifiers.
Roughly, locative prepositions express the location of an object and are
compatible with the absence of directionality or motion. Directional prepo-
sitions, on the other hand, cannot be used as mere indicators of location.
(88) Locative:
a. John was standing under a tree.
b. That cloud is hanging over San Francisco.
c. Breukelen is located between Utrecht and Amsterdam.
(89) Directional:
a. #John was standing up to here.
b. #John was standing from here.
c. #Breukelen is located from Utrecht to Amsterdam.
The example in (90b) is somewhat strange, since it claims that the most
expensive car you can buy is €1000. The example in (90a), in contrast, makes
no such claim. It clearly has a weak reading: there are cars that are cheaper
than €1000 and there might be more expensive ones too. As explained above,
such weak readings are typical for class A quantifiers and do not occur with
class B quantifiers.22 Furthermore, under seems perfectly compatible with
definite amounts, such as in (91).
(91) The total number of guests is under 100. To be precise, it’s 87.
(92) The total number of guests is between 100 and 150. It’s 122.
(93) #The ticket to the Stevie Wonder concert that I bought yesterday cost
from €100 to €800.
(94) Tickets to the Stevie Wonder concert cost from €100 to €800.
It appears then that locative prepositions turn into class A modifiers, while
directional ones turn into class B modifiers. A potential counterexample,
however, is over, which apart from a (relatively rarely used) locative sense, as
in (88b), has a directional sense, such as exemplified in (95).
22 An anonymous reviewer notes a complication. It appears that under cannot take wide scope
with respect to a modal. That is, it fails to display scope ambiguities such as the one in (20)
above. For instance, (i) (which is an example given by the reviewer) is odd, since it misses an
interpretation where the modified numeral has scope over require.
A potential explanation for why the numeral modifier over lacks a direc-
tional/class B sense23 is that the use of prepositions in numeral quantifiers is
restricted to prepositions that are vertically oriented. This is connected to
the observation of Lakoff & Johnson 1980 that cardinality is metaphorically
vertical: more is higher (as in a high number), less is lower (as in a low
number). Prepositions in modified numerals follow this metaphor.24 What
is interesting about over, however, is that only its locative sense is vertical.
Its directional sense, as in (95), rather expresses a mainly horizontal motion.
This could explain why there is no class B sense numeral modifier over.
Further clues that this analysis is on the right track come from Dutch,
where the preposition over lacks a locative sense.
Instead of over in (98), boven (above) should be used for locative meanings.
In Dutch, only boven can modify numerals. Over, which lacks a vertical sense,
is unacceptable in modified numerals.
(102) Class A
(Positive:) more than —, over —
(Negative:) fewer than —, less than —, under —
(Neutral:) between — and —
(103) Class B
(Positive:) at least —, minimally —, from — (up), — or more
(Negative:) at most —, maximally —, up to —, — or fewer, — or less
(Neutral:) from — to —
Missing from this classification are the negative comparative quantifiers like
no more/fewer than 10. The reason for this is that the occurrence of negation
complicates the comparison with other quantifiers. In fact, I think that such
quantifiers are best treated as the compositional combination of a class A
comparative modifier with a negative differential no. See Nouwen 2008b for
the consequences of such a move and for more details on the interpretations
available for sentences containing such quantifiers.
I now turn to a small corpus study I conducted which supports the division
between class A and class B modifiers. Recall that one of the central
observations in favour of the distinction was connected to contrasts such as (104).
Whereas (104a) can be interpreted with respect to a definite actual number
of people invited by Jasper, (104b) does not allow such an interpretation and
instead is evaluated in relation to what the speaker holds possible.
6.2.1 Method
I used the free service for searching the Corpus of Contemporary American
English (COCA, 385 million words, a mix of fiction, science, newspaper and
entertainment texts and spoken word transcripts) at americancorpus.org
(Davies 2008). For each numeral modifier I took 100 quasi-random25 occurrences
of the modifier with a numeral. For each of these cases I examined
whether the modified numeral was in the scope of an explicit existential
modal operator (such as can, could, might, possibly, allow, etc.). In other
words, I only looked at the surface form and only counted the number of
cases where a modal expression has a scope relation with a modified numeral.
Given the theory presented in this article, the prediction is that this number
is significantly higher with class B numerals than with class A expressions.
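The counting step can be sketched as follows. This is a crude approximation and not the procedure actually used: the study relied on manual inspection of scope, whereas the sketch only tests whether a modal word precedes the modifier in the string. The function names, the inflected forms of allow, and the toy sentences are all inventions of this sketch. In particular, the sentences are not corpus data.

```python
import re

# Explicit existential modal expressions named in the text. The inflected
# forms of "allow" are an addition of this sketch.
EXISTENTIAL_MODALS = {"can", "could", "might", "possibly",
                      "allow", "allows", "allowed"}


def has_modal_before(sentence, modifier):
    """Crude surface test: does an existential modal precede the modifier?"""
    lowered = sentence.lower()
    position = lowered.find(modifier)
    if position == -1:
        return False
    words_before = re.findall(r"[a-z]+", lowered[:position])
    return any(word in EXISTENTIAL_MODALS for word in words_before)


def percentage_modal(sample, modifier):
    """Percentage of sample sentences with the modifier after a modal."""
    hits = sum(has_modal_before(s, modifier) for s in sample)
    return 100 * hits / len(sample)


# Invented toy sentences, NOT corpus data.
toy_sample = [
    "The hall can seat up to 200 people.",
    "Up to 30 students signed the petition.",
    "You might wait up to 40 minutes.",
    "We sold up to 12 copies a day.",
]
print(percentage_modal(toy_sample, "up to"))  # 50.0
```

Linear precedence is only a stand-in for scope here, so a heuristic of this kind would both over- and under-count relative to manual annotation.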
I compared five modifiers: fewer than, under, between, at most and up
to. Not all occurrences of these modifiers with a numeral in the corpus were
taken into consideration. For instance, (105) was ignored because in this
example up to is probably not a constituent.26 That is, this example contains
the particle verb to lift up, rather than the verb to lift.
6.2.2 Results
The results, summarised in the table in (106), support the proposal in this
article. Here, P is the percentage of occurrences within an existential modal
context, within a sample of 100 occurrences of that modifier.27
The corpus thus shows a clear preference for combining class B quantifiers
with existential modal operators, as was predicted.28 Whether the data are as
clear as (106) for other expressions too remains to be seen. It will be difficult
to extend this type of study to other modifiers. Maximally and from. . . to,
for instance, were included in the present corpus search, but did not yield
enough occurrences to make a meaningful comparison.
The class A/B distinction is not a peculiarity of the English language. I will
suggest in this subsection that, in fact, the distinction is quite general and
that languages seem to fill in the two classes in roughly the same way. Dutch,
for instance, mirrors the English data perfectly. To illustrate, (107) and (108)
show the A/B distinction in a contrast between comparative and superlative
quantifiers.
There are similar contrasts for other numeral modifiers. In a nutshell, the
Dutch data suggest the two classes in (109), which parallel those of English.
In other languages, we find similar data. For instance, the division between
comparative and superlative modifiers appears to be cross-linguistically quite
general. In Italian, for instance, the following contrast exists.
In Chinese, there also exists a superlative form that behaves like a class B
modifier.
On the other hand, there also exists an alternative form resembling English
at least, which behaves differently. The form zhi-shao can be used in a
similar way to English at least in sentences like At least it doesn’t rain!
Despite this parallel to the English superlative modifiers, the example in (114)
appears to be fine, which suggests zhi-shao is of type A.
I leave a more detailed investigation of such data for further research. What-
ever the outcome, however, the data first and foremost reveal that the type
of contrasts that have been the central focus of this paper occur in Chinese
and that, thereby, Chinese also appears to have the class A/B distinction.
Similar data exist for German bis (zu), Hebrew ’ad, Catalan fins a, Spanish
hasta and Italian fino a. In fact, in Italian it appears that (120) is generally
awkward, resisting a reading that connects to speaker’s possibility. However,
it becomes acceptable if an overt modal verb is inserted.
7 Conclusion
The central aim of this article has been to put forward the empirical ob-
servation that numeral modifiers come in two classes: those that relate to
definite amounts (class A) and those that resist association with definite
cardinality (class B). Theoretically, I proposed that underlying this distinction
is a difference in the kind of relations numeral modifiers encode: either a
simple comparison relation between numbers (class A) or a relation between
a range of values and its minimum or maximum (class B). I furthermore
showed how this theory can be implemented in a framework where numeral
modifiers are treated as degree quantifiers.
While there already existed analyses of both type A and type B modifiers,
the class difference that was the central focus of this article has not yet been
discussed. For the treatment of class A quantifiers in this article I adopted
the proposal of Hackl 2001. My account of class B modifiers, on the other
hand, is original. It can be compared to two closely related proposals on the
semantics of superlative modifiers: Geurts & Nouwen 2007, where superlative
modified numerals are proposed to lexically specify modal operators, and
Krifka 2007b, where superlative quantifiers are proposed to be speech act
modifiers. Neither work discusses the class A/B distinction, but I take it
that both proposals, in view of the main observations of this article,
can be viewed as accounts not just of superlative quantifiers, but of class
B members in general. As suggested in section 5, my proposal is in certain
respects quite close to Krifka’s. It differs greatly, however, from Geurts &
Nouwen 2007 in the way the interaction between modified numerals and
modality is accounted for. In a way, the current article as well as Krifka
2007b represent a position where quantifiers lexically specify quite minimal
functions, which consequently leads to much of the work being done by
pragmatic mechanisms (such as blocking). For the proposal in Geurts &
Nouwen 2007, on the other hand, the balance is different in that a much
greater burden is placed on semantics. An in-depth comparison of these
accounts of class B quantifiers, however, is left for further research.
References
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Atlas, Jay David & Stephen C. Levinson. 1981. It-clefts, informativeness, and
doi:10.1007/s11050-008-9034-2.
Nouwen, Rick. 2009. Two kinds of modified numerals. In T. Solstad &
A. Riester (eds.), Proceedings of Sinn und Bedeutung 13. 15 pages. Available at
http://www.let.uu.nl/~Rick.Nouwen/personal/papers/sub09.pdf.
Nouwen, Rick. 2010a. Two puzzles of requirement. In Maria Aloni & Katrin
Schulz (eds.), The Amsterdam Colloquium 2009, Springer.
http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/papers/neccsuff.pdf.
Nouwen, Rick. 2010b. What’s in a quantifier? In Martin Everaert, Tom Lentz,
Hannah de Mulder, Øystein Nilsen & Arjen Zondervan (eds.), The linguistic
enterprise: From knowledge of language to knowledge in linguistics (Linguistik
Aktuell/Linguistics Today 150), John Benjamins. Pre-published version available at
http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/papers/wiaq.pdf.
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics
and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Schwager, Magdalena. 2005. Exhaustive imperatives. In Paul Dekker & Michael
Franke (eds.), Proceedings of the 15th Amsterdam Colloquium, Universiteit
van Amsterdam.
Solt, Stephanie. 2007. Few more and many fewer: complex quantifiers based
on many and few. In Rick Nouwen & Jakub Dotlacil (eds.), Proceedings of
the ESSLLI2007 Workshop on Quantifier Modification.
Takahashi, Shoichi. 2006. More than two quantifiers. Natural Language
Semantics 14(1). 57–101. doi:10.1007/s11050-005-4534-9.
Umbach, Carla. 2006. Why do modified numerals resist a referential
interpretation? In Proceedings of SALT 15, 258–275. Cornell University
Press.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epis-
temic possibility. Natural Language Semantics 8(4). 255–290.
doi:10.1023/A:1011255819284.
Semantics & Pragmatics Volume 3, Article 4: 1–42, 2010
doi: 10.3765/sp.3.4
Iffiness∗
Anthony S. Gillies
Rutgers University
Abstract
How do ordinary indicative conditionals manage to convey conditional in-
formation, information about what might or must be if such-and-such is
or turns out to be the case? An old school thesis is that they do this by
expressing something iffy: ordinary indicatives express a two-place condi-
tional operator and that is how they convey conditional information. How
indicatives interact with epistemic modals seems to be an argument against
iffiness and for the new school thesis that if -clauses are merely devices for
restricting the domains of other operators. I will make the trouble both clear
and general, and then explore a way out for fans of iffiness.
1 An iffy thesis
One thing language is good for is imparting plain and simple information:
there is an extra chair at our table or we are all out of beer. But — happily — we
∗ This paper has been around awhile, versions of it circulating since 05.2006 and accruing
a lot of debts of gratitude along the way. Chris Kennedy, Jim Joyce, Craige Roberts, Josef
Stern, Rich Thomason, audiences at the Rutgers Semantics Workshop (October 2007), the
Michigan L&P Workshop (Lite Version, November 2007), the Arché Contextualism & Relativism
Workshop (May 2008), the University of Chicago Semantics & Philosophy Language Workshop
(March 2009), and — especially (actually, especially∗ ) — Josh Dever, David Beaver, Kai von
Fintel, Brian Weatherson, and the anonymous S&P referees have all done their best trying
to save me from making too many howlers. But too many is surely context dependent, so
caveat emptor. This research was supported in part by the National Science Foundation
under Grant No. BCS-0547814.
©2010 A. S. Gillies
This is an open-access article distributed under the terms of a Creative Commons Non-
Commercial License (creativecommons.org/licenses/by-nc/3.0).
do not only exchange plain information about tables, chairs, and beer mugs.
We also exchange conditional information thereof: if we are all out of beer, it
is time for you to buy another round. That is very useful indeed.
Conditional information is information about what might or must be, if
such-and-such is or turns out to be the case. My target here has to do with
how such conditional information manages to get expressed by indicative
conditionals (not so called because anyone thinks that’s a great name but
because no one can do any better). Some examples:
(1) a. If the goat is behind door #1, then the new car is behind door #2.
b. If the No. 9 shirt regains his form, then Barça might advance.
c. If Carl is at the party, then Lenny must also be at the party.
operator properly so called. But that is the gist: iffiness — a.k.a. the operator
view — is the thesis that ordinary indicative conditionals manage to express
conditional information because if expresses a conditional operator.
Depending on your upbringing, the operator view of if may well seem
either obvious or obviously wrongheaded. More on that below. Either way,
it is a hard line to maintain: how conditional sentences play with epistemic
modals seems to refute it. A seeming refutation isn’t quite the same as an
actual one, though. I will show that the refutation isn’t quite right by showing
how fans of iffiness can account for what needs accounting for. But before
showing how the operator view can be made to account for how if s and
modals interact I want to make it look for all the world like it can’t be done.
The operator view is an old school story about indicatives. It says that if
expresses some relation between the (semantic value of the) antecedent and
consequent. So if takes its place alongside other connectives and expresses
an operator — the same operator — on the semantic values of the sentences it
takes as arguments.2 To tell a story like this we have to say exactly what that
operator is. But not just any telling will do. I want to show how our simple
examples cause what looks like insurmountable trouble (doom, even) for any
version of the operator view. Here’s an informal sketch of the trouble, what
rides on it, and how — eventually — we can and ought to get out of the mess.
Take this sketch as a promissory note that a formally precise version of all
that can be given; the rest of the paper makes good on that.
Suppose if expresses the limit case conditional operator of material
implication. Iffiness requires that in sentences like (1b) and (1c) either the
epistemic modals outscope the conditionals or the conditionals outscope
the modals. Neither choice gets the truth conditions right if the conditional
operator is the horseshoe. That’s easy to see (and well known).3 Linguists
grow up on arguments like that. That is one reason why even though the
operator view is the first thing a logician thinks of, it is the last thing a
linguist does.
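The point about the horseshoe can be illustrated with a toy model. The sketch below is an illustration under stated assumptions rather than the textbook argument itself. Worlds are reduced to pairs of truth values for an antecedent p and a consequent q, the context is a fixed list of live possibilities, and the function intuitive encodes the assumption that a conditional like (1b) should be true only if some live p-possibility is also a q-possibility. All names are hypothetical.

```python
# Worlds are (p, q) truth-value pairs. The context lists the live possibilities.
# Hypothetical context: no live possibility makes both p and q true.
context = [(False, True), (True, False)]


def horseshoe(a, b):
    """Material implication."""
    return (not a) or b


def might_wide(context):
    """might (p ⊃ q): some live possibility verifies the material conditional."""
    return any(horseshoe(p, q) for p, q in context)


def might_narrow(world, context):
    """p ⊃ might q, evaluated at a given world."""
    p_here, _ = world
    return horseshoe(p_here, any(q for _, q in context))


def intuitive(context):
    """Assumed target reading: some live p-possibility is a q-possibility."""
    return any(p and q for p, q in context)


print(intuitive(context))                           # False
print(might_wide(context))                          # True
print([might_narrow(w, context) for w in context])  # [True, True]
```

In this context both scope options come out true although, on the assumed target reading, the conditional should be false: the wide-scope reading is verified by any live possibility that falsifies p, and the narrow-scope reading ignores whether q is possible given p.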
2 If is a little word with a big history — a big history that we can’t adequately tour here. But
there are guides for hire: for instance, Bennett (2003) and von Fintel (2009).
3 The material conditional analysis of ordinary indicatives is defended (in somewhat different
ways) by, for example, Grice (1989), Jackson (1987), and Lewis (1976). A textbook version of
this “no-scope” argument that has the horseshoe analysis as its target appears in von Fintel
& Heim 2007.
But (as I’ll show) this very same trouble holds no matter what conditional
operator an iffy story says if expresses. To see that requires two things. First,
we need to say in a precise way what counts as a conditional operator (Section
4). Given some pretty weak assumptions iffiness requires that if means all
(well, all relevant). Second, there are some characteristic Facts about how
indicatives and epistemic modals interact (Section 5). These neatly divide:
there are some consistency facts and there are some intuitive entailment
facts. The operator view requires that either the conditionals outscope the
modals or the modals outscope the conditionals. Something general then
follows: no matter what conditional operator we say if expresses, one scope
choice is ruled out by the consistency facts, the other by the entailments
(Section 6).
That seems to be bad news for any fan of any version of the old school
operator view. And there seems to be more bad news in the offing since
the operator view isn’t the only game in town (in some circles, it’s a game
played only on the outskirts of town). The anti-iffiness rival — a.k.a. the
restrictor view — is a new school approach. It embraces Kratzer’s thesis that
if is not a connective at all: it doesn’t express an operator, a fortiori not
an iffy operator, and a fortiori not the same iffy operator in each of our
example sentences it figures in.4 Instead, says the restrictor analysis, if
simply restricts other operators. In the cases we will care about, it restricts
(possibly covert) epistemic modals. The restrictor view makes embarrassingly
quick work of the data that spells such trouble for the operator view (Section
7).
But the success of the restrictor analysis is no argument against Chuck
Taylors and skyhooks tout court. That’s because there are old school stories
that say that if expresses a strict conditional operator over possibilities
compatible with the context, and that it can do all the restricting that needs
doing (Section 8). Once we see just how, we can look back and see more
4 The restrictor view gets its inspiration from Lewis’s (1975) argument that certain if s (under
adverbs of quantification) cannot be understood as expressing some conditional but rather
serve to mark an argument place in a polyadic construction. Kratzer’s thesis is that this holds
for if across the board. The classic references are Kratzer 1981, 1986. There is another rival,
too: some take if to be an operator, but an operator that does not (when given arguments)
express a proposition (Adams 1975; Gibbard 1981; Edgington 1995, 2008). Instead, they say,
if s express but do not report conditional beliefs on the part of their speakers. I will ignore
this view here: it doesn’t really start off as the most plausible candidate, the trouble I make
here about how if s and modals interact makes it less plausible not more, and it will just take
us too far afield.
clearly what is at stake in the difference between new school and old, why
iffiness is worth pursuing (Section 9), and how this version of the old school
story relates to recent dynamic semantic treatments (Section 10).
3 Ground rules
Let’s simplify. Assume that meanings get associated with sentences by getting
associated with formulas in an intermediate language that represents the
relevant logical forms (lfs) of them. Thus a story, old school or otherwise,
has to first say what the relevant lfs are and then assign those lfs semantic
values.
We will begin with an intermediate language L that has a conditional
connective that will serve to represent the lfs of ordinary indicatives. So let
L be generated from a stock of atomic sentence letters, negation (¬), and
conjunction (∧) in the usual way. But L also has the connective (if ·)(·),
and the modals must and might. What I have to say can be said about
an intermediate language that allows that the modals mix freely with the
formulas of the non-modal fragment of L but restricts (if ·)(·) so that it
takes only non-modal sentences in its first argument. So assume that L is
such an intermediate language. When these restrictions outlive their utility,
we can exchange them for others.5
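The syntactic ground rules can be pictured with a small datatype. This is only a sketch: the class names and the checker in_L are hypothetical, and it reads the restriction on (if ·)(·) as banning must and might anywhere inside the first argument, which is one way of spelling out "non-modal".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    body: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Must:
    body: object


@dataclass(frozen=True)
class Might:
    body: object


@dataclass(frozen=True)
class If:
    antecedent: object
    consequent: object


def subformulas(f):
    """Yield f and every formula nested inside it."""
    yield f
    if isinstance(f, (Not, Must, Might)):
        yield from subformulas(f.body)
    elif isinstance(f, And):
        yield from subformulas(f.left)
        yield from subformulas(f.right)
    elif isinstance(f, If):
        yield from subformulas(f.antecedent)
        yield from subformulas(f.consequent)


def is_modal_free(f):
    """True if neither must nor might occurs anywhere in f."""
    return not any(isinstance(g, (Must, Might)) for g in subformulas(f))


def in_L(f):
    """True if every `if` in f has a modal-free first argument."""
    return all(is_modal_free(g.antecedent)
               for g in subformulas(f) if isinstance(g, If))


carl, lenny = Atom("carl"), Atom("lenny")
print(in_L(If(carl, Must(lenny))))   # True: the shape of (1c)
print(in_L(If(Might(carl), lenny)))  # False: modal in the first argument
```

Modals may still scope over a whole conditional, so Must(If(carl, lenny)) also passes the check.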
Iffiness requires that the if of English expresses something properly iffy.
That leaves open just which conditional operator we say that the if of English
means. But our choices here are not completely free, and some ground rules
will impose some order on what we may say. These will constrain our choice
by saying what must be true for a conditional operator to be rightfully so
called. But before getting to that, I’ll start with what I will assume about
contexts.
First, a general constraint: assume that truth-values (for the ifs and
the modals, when we come to that, as well as for the boolean fragment of
L) are assigned at an index (world) i with respect to a context. I will assume
that W, the space of possible worlds, is finite. Nothing important turns on
this, and it simplifies things.
For the fragment of L with no modals and no ifs, contexts are idle. It will
be the job of the modals to quantify over sets of live possibilities and the job
5 Conventions: p, q, r, ... range over sentences of L (subject to our constraints on L); i, j, k, ...
range over worlds; and P, Q, R, ... range over sets of worlds. And let’s not fuss over whether
what is at stake is the ‘if’ of English or the ‘if’ of L; context will disambiguate.
4:5
A. S. Gillies
of contexts to select these sets of worlds over which the modals do their job.
What I want to say can be said in a way that is agnostic about just what kinds
of things contexts are: all I insist is that, given a world, they determine a set
of possibilities that modals at that world quantify over.6 The functions doing
the determining need to be well-behaved.
Given a context c — replete with whatever things contexts are replete
with — an epistemic modal base C determined by it is just what we need:
4 Conditional operators
with Ci .
comes not from D.K. but from C.I. You thus take if to be strict implication
(restricted to C). But that, too, can be put in terms of orderings: your ordering
i is universal, treating all worlds the same. Whence it follows that — since
the nearest p-world is the same distance from i as is every world — taking
Di to be the set of possibilities no further from i as the nearest p-world
amounts to taking Di to be the set of all worlds W , restricted by Ci .
Example 3 (material conditional). Suppose you are smitten by truth-tables,
and your favorite incarnation of the operator view is the material conditional
story. Equivalently: you will have a maximally discerning ordering (every
world an island) and take Di to be the set of closest worlds to i simpliciter
according to that ordering. For an if at i you will thus take Di to be {i}.
(For an if at some other world j, even an if with the same antecedent and
consequent as the one at i, take Dj to be {j}.)
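The strict and material examples differ only in how Di is picked out, and a small sketch can make that concrete. It assumes, ahead of the argument, that the conditional relation is inclusion of the relevant antecedent worlds in the consequent worlds. The names `if_at`, `strict_domain`, and `material_domain`, the two-atom model, and the all-worlds-are-live context are illustrative assumptions, not part of the text.

```python
from itertools import product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]

def prop(atom):
    """Set of worlds at which an atomic sentence is true."""
    return {w for w in WORLDS if atom in w}

def if_at(i, P, Q, D):
    """Assumed inclusion reading: the relevant P-worlds D(i) & P all lie in Q."""
    return (D(i) & P) <= Q

def strict_domain(C):
    """Strict implication restricted to C: D_i is W restricted by C_i."""
    return lambda i: set(C(i))

def material_domain(i):
    """Material conditional: D_i is {i}."""
    return {i}

P, Q = prop("p"), prop("q")
C = lambda i: set(WORLDS)  # hypothetical context in which every world is live

# With D_i = {i}, the conditional agrees with the material truth table everywhere.
material_matches_truth_table = all(
    if_at(i, P, Q, material_domain) == ((i not in P) or (i in Q))
    for i in WORLDS
)

# With D_i = C_i, the conditional is true only if every live p-world is a q-world.
strict_value = if_at(WORLDS[0], P, Q, strict_domain(C))
```

Here `strict_value` comes out false because the live (p ∧ ¬q)-world is a counterexample, while the material version is still true at every world that is not itself such a counterexample.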
Summing this all up: even before taking a stand on just what relation
between relevant antecedent possibilities and consequent possibilities that if
must express in order to express a conditional operator properly so called,
we know that it must still express such a relation. So let’s insist that we
can put things that way, parametric on just how Di gets picked out and
so parametric on what counts as “relevant” antecedent possibilities and so
parametric on the details of your favorite theory:
Such R’s are precisely those for which the set of Q’s a Di ∩ P bears it to
form a filter that contains P .11 That is an aesthetic reason for constraining
R this way. Such R’s also jointly characterize the basic conditional logic.12
The relational properties correspond to reflexivity, right upward monotonic-
ity, and conjunction. That is another — only partly aesthetic — reason for
constraining them this way.
Second, R must care about consequents. This is just the requirement that
conditional relations, like quantifiers, be active:
example, perhaps insisting that it is the closest worlds in P to i that must bear
R to Q. If we systematically swap possibilities for possibilities in a way that
preserves the relevant structure, then the conditional relation ought to hold
pre-swapping iff it holds post-swapping. And mutatis mutandis for Di : since
once the posited structure does its job determining Di , then any systematic
swapping of possibilities that leaves the domain untouched should also leave
the conditional relation untouched.13
Where π is such a mapping and P a set of worlds, let π (P ) be the set of
worlds i such that π (j) = i for some j ∈ P . Then:
The intuitive version is just this: if R holds between Di ∩ P and Q then the
former must be included in the latter. That is because if things didn’t go that
way then the witnessing counterexample world could play the role of any
one of the confirming worlds. But that would mean that confirming worlds
5 Three facts
(2) Red might be in the box and Yellow might be in the box.
So, if Yellow isn’t in the box, then Red must be.
And if Red isn’t in the box, then Yellow must be.
Conjunctions of epistemic modals like Red might be in the box and Yellow
might be in the box are especially useful when the bare prejacents partition
the possibilities compatible with the context. The first fact is simply that ifs
are consistent with such conjunctions of modals.
I do not know whether Carl made it to the party. But wherever Carl goes,
Lenny is sure to follow. So if Carl is at the party, Lenny must be — Lenny is at
the party, if Carl is. We just glossed an if with a commingling epistemic must
by a bare if with no (overt) modal at all. Thus:
This pair has the ring of (truth-conditional) equivalence. Fact 2 below records
that. But there are also arguments for thinking that the truth-value of (3a)
should stand and fall with the truth-value of (3b).
For suppose that such ifs validate a deduction theorem and modus
ponens, and that must is factive.16 The left-to-right direction: assume that
(3a) is true. And consider the argument:
The first two sentences — intuitively speaking — entail the third. And that is
pushed on us by the assumptions: from the first two sentences we have (by
modus ponens) that Lenny must be at the party, which by factivity entails
Lenny is at the party. Apply the deduction theorem and we have that If Carl
is at the party, then Lenny must be at the party entails If Carl is at the party,
then Lenny is at the party. Since we have assumed that (3a) is true, it follows
that (3b) must be. There are spots to get off this bus to be sure — by denying
either modus ponens or by denying the factivity of must — but those costs
are high.17
The right-to-left direction: assume that (3b) is true and consider:
16 Remember that, for now, we are dealing with properties of sentences of (quasi-)English not
properties of those sentences’ lfs in some regimented language. The argument here isn’t
meant to convince you of Fact 2, it is meant to make some of the costs of denying the data
vivid. Geurts (2005) also notes that bare conditionals and their must-enriched counterparts
are “more or less equivalent”.
17 You have to troll some pretty dark corners of logical space for deniers of modus ponens,
but that’s not true for deniers of the factivity of must. That view has something of mantra
status among linguists (philosophers are surprised to hear that). Mantra or not, it is wrong.
For an all-out attack on it see von Fintel & Gillies 2010. Here is just one sort of consideration:
if must p didn’t entail p (because must is located somewhere below the top of the scale of
epistemic strength), then you’d expect must to combine with only in straightforward ways
the way might can:
Fact 2 (if/must). Conditional sentences like these are true in exactly the
same scenarios:
i. if S1, then must S2
ii. if S1, then S2
The glossing that this pattern permits is a nifty trick. But that is only half
the story since if can also co-occur with epistemic might. The interaction
between if and might is different and underwrites a different glossing.
Alas, my team are not likely to win it all this year. It is late in the season
and they have made too many miscues. But they are not quite out of it. If
they win their remaining three games, and the team at the top lose theirs,
my team will be champions. But our last three are against strong teams
and their last three are against cellar dwellers. Still, my spirits are high:
if we win out, we might win it all. Put another way, within the (relevant)
my-team-wins-out possibilities — of which there are some — lies a my-team-
wins-it-all possibility; there is a my-team-wins-out possibility that is a my-
team-wins-it-all possibility. But that is just to say that there are (relevant)
my-team-wins-out-and-wins-it-all possibilities. Maybe not very many, and
maybe not so close, but some.18
Apart from keeping hope alive, the example also illustrates that we can
gloss an indicative with a co-occurring epistemic might by a conjunction
under the scope of might:
But it doesn’t.
18 For the record: the Cubs. Please don’t bring it up.
That gloss sounds pretty good. And for good reason: conjunctions that you
would expect to be happy if the truth of (6a) and (6b) could come apart are
not happy at all:
(7) a. #If my team wins out, they might win it all; moreover, they can’t win
out and win it all.
b. #It might turn out that my team wins out and wins it all, and, in
addition there’s no way that if they win out, they might win it all.
That gives us the third Fact about how ifs play with modals.19
Fact 3 (if/might). Sentences like these are true in exactly the same scenarios:
i. if S1, then might S2
ii. it might be that [S1 and S2]
It’s now a matter of telling some story, iffy or otherwise, that answers to
these Facts. Old school operator views will have trouble with them; the new
school restrictor view predicts them trivially.
6 Scope matters
The operator view takes if to express an operator, an iffy operator, and the
same iffy operator no matter whether we have a co-occurring epistemic modal
or not and no matter whether the modal is must or might. In cases where
there is a modal, scope issues have to be sorted out. Take a sentence of the
form
The first is true, the second an overreaction. I intend, for now, to sweep this under the same
rug that we sweep the odd way in which Some smoke and get cancer/Some get cancer and
smoke don’t feel exactly equivalent even though Some is a symmetric quantifier if ever there
was one. (The rug in question seems to be the tense/aspect rug; similar considerations drive
von Fintel’s (1997) discussion of contraposition of bare conditionals.)
and let S1′ (S2′) be the L-representation for sentence S1 (S2), and modal the
L-representation for modal. We have a short menu of options for the relevant
lf for such a sentence — either the narrowscoped (9a) or the widescoped (9b):
If you want to put your lfs in tree form, be my guest: opting for nar-
rowscoping means opting for sisterhood between modal and S2 ; opting for
widescoping means opting for sisterhood between modal and if S1 then S2 .
The trouble for the operator view is that, since if has to express inclusion,
neither choice will do. One choice for scope relations seems ruled out by
consistency (Fact 1), the other by if/must (Fact 2) and if/might (Fact 3).
To put the trouble precisely, we need one more ground rule. Contexts,
we said, have the job of determining the domains the modals quantify over.
Modals, I’ll assume, do their job in the usual way by expressing their usual
quantificational oomph over those domains: must (at i, with respect to C)
acts as a universal quantifier, and might as an existential quantifier, over Ci .
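The ground rule for the modals can be sketched in a few lines. This is a minimal illustration, assuming worlds are valuations of two atoms and a context is a function from a world to its set of live possibilities; the names `must`, `might`, and the particular context `C` are hypothetical choices made for the example.

```python
from itertools import product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]
P = {w for w in WORLDS if "p" in w}
Q = {w for w in WORLDS if "q" in w}

def must(C, X):
    """Worlds i at which every world in C(i) is an X-world (universal force)."""
    return {i for i in WORLDS if C(i) <= X}

def might(C, X):
    """Worlds i at which some world in C(i) is an X-world (existential force)."""
    return {i for i in WORLDS if C(i) & X}

# Hypothetical context: at i and j the live possibilities are exactly i and j.
i = frozenset({"p"})   # a (p and not-q)-world
j = frozenset({"q"})   # a (q and not-p)-world

def C(k):
    return {i, j} if k in (i, j) else {k}

might_p_at_i = i in might(C, P)
might_q_at_i = i in might(C, Q)
must_p_at_i = i in must(C, P)
# might X holds exactly where must not-X fails.
dual = might(C, P) == set(WORLDS) - must(C, set(WORLDS) - P)
```

In this context both might-claims hold at i while must p fails there, since j is a live world where p is false.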
Now suppose we plump for narrowscoping. Then, given the ground rules,
we cannot predict the consistency of the likes of (2) and that means that we
cannot square iffiness with Fact 1. That’s true no matter how you fill in the
particulars of the iffy story.
Here is the narrowscoped analysis of my lost marbles. We have a modal
and two indicatives:
(10) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
if ¬q must p
c. If Red isn’t in the box, then Yellow must be.
if ¬p must q
Any good story has to allow that the bundle of ifs in (10b) and (10c) is
consistent with the conjunction in (10a). But, assuming narrowscoping,
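Proof. Suppose otherwise: that the regimented formulas in L are all true at
a live possibility, say i, with respect to C. Just one of my marbles is in the
box. So any world in Ci is either a p-world or a q-world, but not both; C is
well-behaved, so i ∈ Ci. That leaves two cases.
case 1: $i \in \llbracket \neg q \rrbracket$. By hypothesis $\llbracket \textit{if}\ \neg q\ \textit{must}\ p \rrbracket^{C,i} = 1$, and so
$D_i \cap \llbracket \neg q \rrbracket^{C} \subseteq \llbracket \textit{must}\ p \rrbracket^{C}$. Since $i \in D_i$, it then follows that $i \in \llbracket \textit{must}\ p \rrbracket^{C}$, which
is to say $\llbracket \textit{must}\ p \rrbracket^{C,i} = 1$. Thus Ci has only p-worlds in it. But that is at
odds with the second conjunct of (10a): that might q is true at i guarantees a
q-world, hence a ¬p-world, in Ci.
case 2: $i \in \llbracket \neg p \rrbracket$. By hypothesis $\llbracket \textit{if}\ \neg p\ \textit{must}\ q \rrbracket^{C,i} = 1$, and so
$D_i \cap \llbracket \neg p \rrbracket^{C} \subseteq \llbracket \textit{must}\ q \rrbracket^{C}$. Since $i \in D_i$, it then follows that $i \in \llbracket \textit{must}\ q \rrbracket^{C}$, which
is to say $\llbracket \textit{must}\ q \rrbracket^{C,i} = 1$. Thus Ci has only q-worlds in it. But that is at odds
with the first conjunct of (10a): that might p is true at i guarantees a p-world,
hence a ¬q-world, in Ci.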
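The two-case argument can also be checked by brute force on a small model. The sketch below assumes that if expresses inclusion, that i ∈ Di, and that a context is well-behaved in the sense of containing the world of evaluation and agreeing across its live worlds; it then searches every choice of Di for a way to make (10a), (10b), and (10c) true together. The helper names and the two-atom model are illustrative assumptions.

```python
from itertools import combinations, product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]
P = {w for w in WORLDS if "p" in w}
Q = {w for w in WORLDS if "q" in w}
NOT_P = set(WORLDS) - P
NOT_Q = set(WORLDS) - Q
# "Just one of my marbles is in the box": exactly one of p, q is true.
ONE_MARBLE = [w for w in WORLDS if len(w) == 1]

def subsets(xs):
    """All subsets of xs, as Python sets."""
    xs = list(xs)
    for n in range(len(xs) + 1):
        for combo in combinations(xs, n):
            yield set(combo)

def must(C, X):
    return {k for k in WORLDS if C(k) <= X}

def might(C, X):
    return {k for k in WORLDS if C(k) & X}

def counterexamples():
    """Collect every (live set, i, D_i) at which (10a)-(10c) are all true."""
    found = []
    for live in subsets(ONE_MARBLE):
        if not live:
            continue
        # Assumed well-behaved context: live worlds share one live set.
        C = lambda k, live=live: live if k in live else {k}
        for i in live:
            for D_i in subsets(WORLDS):
                if i not in D_i:
                    continue  # the ground rules put i in D_i
                a = i in might(C, P) and i in might(C, Q)
                b = (D_i & NOT_Q) <= must(C, P)   # narrowscoped (10b)
                c = (D_i & NOT_P) <= must(C, Q)   # narrowscoped (10c)
                if a and b and c:
                    found.append((live, i, D_i))
    return found

no_model = counterexamples() == []
```

The search comes back empty, which matches the proof: whichever of the two live worlds i is, it already falsifies one of the narrowscoped conditionals.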
Narrowscoping has the virtue of taking plain and simple lfs to represent
indicatives with apparently epistemic modalized consequents. But it has the
vice of not squaring with consistency. This is true no matter the particulars
of your favorite version of the operator view.20
So suppose instead that co-occurring modals scope over the if-constructions
in which they occur. Now it is the generalizations if/must and if/might
that cause trouble. Again, that’s true no matter how Di is chosen and so
no matter what counts as an if-relevant possibility and so no matter what
conditional operator we say if expresses.
Here is a widescope analysis of the key examples (3) and (6):
20 Thus by supplying how your favorite version of the operator view says Di is determined, you
can use this proof to show how that story (assuming narrowscoping) departs from Fact 1.
and so it is not true that the plain if is true at every world in Ci and so
$\llbracket \textit{must}\ \textit{if}\ p\ q \rrbracket^{C,i} = 0$.
Again, this is true no matter how we fill in the particulars of the operator
view. If we widescope the modals, and the story is chauvinistic, it will not
square with Fact 2.
Given widescoping, egalitarianism fares no better. But here it is
if/might (Fact 3) that causes trouble. This time the issue is triviality: must-enriched
ifs are true iff their might-enriched counterparts are.
Here is why. First, egalitarianism implies that Di covers Ci :
that is only because he requires that i induce a total order that is centered pointwise on
i, and that rules against absoluteness. But the pragmatic mechanisms he develops there
are agnostic on the chauvinism question — what he says about how the context constrains
selection functions is compatible with both egalitarianism and chauvinism. I myself see
little reason to go for chauvinism.
Given widescoping, any story with this equivalence will have a hard time
saying why conditionals like (12a) seem to be true iff modalized conjunctions
like (12b) are and so will have trouble with if/might. That is because, given
the usual story for the modals (Definition 6.1), we get triviality:
Thus widescoping plus egalitarianism implies that must if p q is true
iff might(p ∧ q) is. Not even Cubs fans fall for that.
22 Strictness makes it easy to understand why negating a bare conditional sounds so much
like saying the counterexample might obtain. For more on context-dependent strictness
(of different flavors) see, e.g., Veltman 1985, von Fintel 1998a, 2001, and Gillies 2004, 2007,
2009.
23 Thus, given well-behavedness (Definition 3.2), explaining Fact 2 is easy for widescoping
egalitarians: if p q is equivalent to must (p ⊃ q) which, given well-behavedness, is
equivalent to must must (p ⊃ q). And that, in turn, is equivalent to must if p q .
7 Iffiness lost
Do we widescope or narrowscope these? What principled story is there that predicts, rather
than stipulates, that the first is widescoped and the second narrowscoped? Third because
as soon as we consider epistemic modals that lie between the existential might and the
universal must — like probably and unlikely — it is doomed to failure anyway.
(13) {Always / Sometimes / Never} if a man owns a donkey, he beats it.
The job of the if-clause in (13) is merely to restrict the domain over which
the adverb (unselectively) quantifies, and allegedly that restricting job is a
job that cannot be done by treating if as a conditional connective with a
conditional operator as its meaning. If Q-adverb is universal, maybe an iffy
if will work; but if it is existential, then conjunction does better. I want to set
the issue about adverbial (and adnominal, for that matter) quantifiers aside for
two reasons. First because I doubt the allegation sticks. But that is another
argument for another day.25 And second because it will do us good to focus
on simple cases.
Still, the trouble for the operator view that is center stage here does look
quite a lot like the problem Lewis pointed out. We have to make room for
interaction between if-clauses and the domains our modals quantify over.
But that interaction is tricky. That is because it looks impossible to assign
if the same conditional meaning — thereby taking its contribution to be an
iffy one — in all of our examples. Indeed, when the modal is universal a con-
ditional relation looks good; but when the modal is existential, conjunction
looks better. This is pretty much the same trouble Lewis saw for ifs occurring
under adverbs of quantification, and led him to conclude that such ifs do not
express operators at all (and a fortiori not conditional operators).26 Just as
with adverbial quantifiers, there is a fast and easy solution to the problem
if we get rid of the old school idea that if is a conditional connective and
plump instead for anti-iffiness. The most forceful way of putting the anti-iffy
thesis is Kratzer’s (1986: 11):
25 There are ways to get the restricting job done after all. The operator-based stories in, e.g.,
Belnap 1970, Dekker 2001, and von Fintel & Iatridou 2003 all manage.
26 For recent and more thorough-going defenses of ifs-as-quantifier-restrictors see, e.g., Kratzer
1981, 1986 and von Fintel 1998b. But see Higginbotham 2003 for a dissenting view.
The thesis is that the relevant structure for the conditionals at issue here
is not some modal scoped over a conditional nor some conditional with a
modal in its consequent, but is instead something like
The job of the if-clause is to restrict the domain over which the modal
quantifies. So instead of searching for a conditional operator properly so
called that if contributes whether it commingles with a modal or not, we
search for an operator for if to restrict. And, for indicative conditionals,
we do not have to search far: the operators are (possibly covert) epistemic
modals.27
So it is the modals, not the ifs, that take center stage. They have logical
forms along the lines of modal(p)(q), with the usual quantificational force:
This plus two assumptions gets us the now-standard and familiar restrictor
view. It easily accounts for consistency (Fact 1), if/must (Fact 2), and
if/might (Fact 3).
First assumption: assume that when there is no if-clause and so no
restrictor is explicit (as in Blue might be in the box or Yellow must be in
the box) the first argument in the lf of the modal is filled by your favorite
tautology (⊤). In those cases there is nothing to choose between an analysis
that follows our earlier Definition 6.1 and an analysis that follows Definition
27 Officially, our intermediate language now also goes in for a change. L had one-place modals
might and must and a two-place connective (if ·)(·). That won’t do to represent the restrictor
view. Instead, we need the two-place modals might (·)(·) and must (·)(·) and have no need
for a special conditional connective that expresses a conditional operator.
(17) a. Red might be in the box and Yellow might be in the box.
might (⊤)(p) ∧ might (⊤)(q)
b. If Yellow isn’t in the box, then Red must be.
must (¬q)(p)
c. If Red isn’t in the box, then Yellow must be.
must (¬p)(q)
It’s modals all the way down. And the modals can all be true together.
Proof. I am in i and there are just two worlds compatible with the facts I
have, i and j. The first is a (p ∧ ¬q)-world, the second a (q ∧ ¬p)-world.
The restrictors in (17a) are trivial, so it is true at i iff Ci has a p-world in
it and a q-world in it; i witnesses the first conjunct, j the second. The
restricting if-clause of (17b) makes sure that the must ends up quantifying
only over the ¬q-worlds compatible with C: (17b) is true at i iff all of the
worlds in $C_i \cap \llbracket \neg q \rrbracket$ are p-worlds. And the only one, i, is. Similarly for the must
in (17c): it quantifies over the ¬p-worlds in Ci , checking to see that they are
all q-worlds.
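The two-world scenario of the proof can be written out as a small check. This is a sketch under stated assumptions: two-place modals are modelled as functions of a context, a restrictor set, and a scope set, and the names `must2`, `might2`, `TOP`, and the context `C` are illustrative, not the text's notation.

```python
from itertools import product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]
P = {w for w in WORLDS if "p" in w}
Q = {w for w in WORLDS if "q" in w}
NOT_P = set(WORLDS) - P
NOT_Q = set(WORLDS) - Q
TOP = set(WORLDS)  # the trivial restrictor (a tautology)

def must2(C, R, S):
    """must(R)(S): every live R-world is an S-world."""
    return {k for k in WORLDS if (C(k) & R) <= S}

def might2(C, R, S):
    """might(R)(S): some live R-world is an S-world."""
    return {k for k in WORLDS if C(k) & R & S}

i = frozenset({"p"})   # the (p and not-q)-world
j = frozenset({"q"})   # the (q and not-p)-world

def C(k):
    # Hypothetical context: i and j are the only worlds compatible with the facts.
    return {i, j} if k in (i, j) else {k}

ex_17a = i in might2(C, TOP, P) and i in might2(C, TOP, Q)
ex_17b = i in must2(C, NOT_Q, P)
ex_17c = i in must2(C, NOT_P, Q)
```

All three come out true at i, because each must only inspects the live worlds that survive its restrictor.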
It is just as easy to square this picture with if/must (Fact 2) and if/might
(Fact 3). Here are the examples with their new school lfs:
Proof. Anti-iffiness assigns the same lf to a bare conditional like (18b) and
its must-enriched counterpart (18a): must (p)(q). It would thus be hard, and
pretty undesirable, for their truth conditions to come apart. That explains
if/must.
Now consider the if-as-restrictor analysis of the sort of examples behind
if/might in (19). If (19b) is true at i in C then Ci has a (p ∧ q)-world in it.
But then that same world must be in $C_i \cap \llbracket p \rrbracket$. It is a q-world, and that will
witness the truth of (19a) at i. Going the other direction: if (19a) is true at
i in C, then there are some q-worlds in $C_i \cap \llbracket p \rrbracket$. Any one of those will do
as a (p ∧ q)-world in Ci , and that is sufficient for (19b) to be true at i. That
explains if/might.
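The if/might half of the argument can be verified exhaustively on a toy model. The sketch below assumes a four-world space and treats a context at i as just its set of live worlds; the function names are hypothetical. It checks that might(p)(q) and might(⊤)(p ∧ q) agree for every possible live set, and that must(p)(q) does not collapse into might(p)(q).

```python
from itertools import combinations, product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]
P = {w for w in WORLDS if "p" in w}
Q = {w for w in WORLDS if "q" in w}
TOP = set(WORLDS)

def subsets(xs):
    """All subsets of xs, as Python sets."""
    xs = list(xs)
    for n in range(len(xs) + 1):
        for combo in combinations(xs, n):
            yield set(combo)

def might2_at(live, R, S):
    """might(R)(S) relative to the live set C_i."""
    return bool(live & R & S)

def must2_at(live, R, S):
    """must(R)(S) relative to the live set C_i."""
    return (live & R) <= S

# if/might: restricting by p and asking for q is asking for a (p and q)-world.
fact3_holds = all(
    might2_at(live, P, Q) == might2_at(live, TOP, P & Q)
    for live in subsets(WORLDS)
)

# The two modals still come apart, so there is no triviality on this view.
differ = [live for live in subsets(WORLDS)
          if might2_at(live, P, Q) != must2_at(live, P, Q)]
```

The equivalence is immediate from the definitions; the enumeration only confirms that nothing in the toy model disturbs it.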
These explanations are easy. And, given the trouble for the operator
view, it looks like the only game in town is to say that if doesn’t express an
operator and so not an iffy operator. That stings.
8 Iffiness regained
got done.
So far this isn’t a story about the meaning of if (much less an iffy one). It
is a blueprint for how to construct a semantics that gives a uniform and iffy
meaning to ifs whether or not those ifs mix and mingle with other operators.
To construct a story using it we need to take a stand on what it means to add
the information carried by an antecedent to the contextually relevant stock
of information. Taking that stand depends on the aspirations of the theory
since different constructions may depend on different sorts of contextually
available information and there is every reason to think that augmenting
information of different sorts goes by different rules. But our aspirations are
pretty modest here: how indicatives interact with epistemic modals. So we
can opt for an equally simple stand on what it means to add information to a
context.
Even before getting all the details laid out, we can see how the doubly
shifty behavior of if-clauses will be able to predict what needs predicting
about how indicatives and epistemic modals interact. The difference between
interpreting q against the backdrop of the prior context C and against the
backdrop of C + p is a difference that makes no difference if q has no context
sensitive bits in it. No wonder we missed it! But if q does have context
sensitive bits in it — like might or must, whose semantic value depends
non-trivially on C — then this is a difference that makes all the difference.
For example: consider a modal like must q. The contexts C and C + p may
well determine different sets of possibilities. Since must q depends exactly
on whether that set of possibilities has only q-worlds in it, we then get
a difference. Thus if must q is the consequent of an indicative, context-
shiftiness matters.
Here is the simplest way of constructing a semantics around the blueprint:
at worlds other than i. It is context-shifty since the truth of if p q in C
depends on the truth of the constituent q in contexts other than C.
The if/modal interactions that were such trouble were only trouble because
we forgot to keep track of the context-shifting job of if-clauses. And
doing that, even in the simple context-shifting in Definition 8.1, is enough to
make iffiness sit better with the Facts.
I know that just one of my marbles is in the box — either Red or Yel-
low — but do not know which it is. Narrowscope the modals. Then all of
these can be true together:
(20) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
if ¬q must p
c. If Red isn’t in the box, then Yellow must be.
if ¬p must q
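The claim that (20a) through (20c) can be true together under narrowscoping can be checked with a small evaluator. This is a sketch under assumptions: adding p to a context is modelled as intersecting each live set with the p-worlds, and if p q is true at i iff every world in Ci ∩ p is one where q is true relative to the shifted context. The tuple encoding of formulas, the function `ev`, and the `shifty` switch (which turns the context shift off for contrast) are all hypothetical names introduced for illustration.

```python
from itertools import product

ATOMS = ("p", "q")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]

def ev(f, C, shifty=True):
    """Set of worlds at which formula f is true relative to context C."""
    op = f[0]
    if op == "atom":
        return {w for w in WORLDS if f[1] in w}
    if op == "not":
        return set(WORLDS) - ev(f[1], C, shifty)
    if op == "and":
        return ev(f[1], C, shifty) & ev(f[2], C, shifty)
    if op == "must":
        X = ev(f[1], C, shifty)
        return {k for k in WORLDS if C(k) <= X}
    if op == "might":
        X = ev(f[1], C, shifty)
        return {k for k in WORLDS if C(k) & X}
    if op == "if":
        ante = ev(f[1], C, shifty)
        # Assumed update: C + p keeps only the live antecedent worlds.
        shifted = lambda k: C(k) & ante
        cons = ev(f[2], shifted if shifty else C, shifty)
        return {k for k in WORLDS if (C(k) & ante) <= cons}
    raise ValueError(f"unknown operator: {op}")

p, q = ("atom", "p"), ("atom", "q")
f20a = ("and", ("might", p), ("might", q))
f20b = ("if", ("not", q), ("must", p))
f20c = ("if", ("not", p), ("must", q))

i = frozenset({"p"})   # the (p and not-q)-world
j = frozenset({"q"})   # the (q and not-p)-world

def C(k):
    # Hypothetical context containing just the two live worlds i and j.
    return {i, j} if k in (i, j) else {k}

all_true_shifty = all(i in ev(f, C) for f in (f20a, f20b, f20c))
b_true_without_shift = i in ev(f20b, C, shifty=False)
```

With the shift in place all three formulas are true at i. With the shift switched off, (20b) fails at i, which is the inconsistency the narrowscoped analysis ran into earlier.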
Proof. Here is why. Suppose — for concreteness and without loss of general-
ity — that C contains just two worlds: i, a (p ∧ ¬q)-world and j, a (q ∧ ¬p)-
world. So (20a) is true at i.
Now take (20b). It is true at i in C, given iffiness + shiftiness, iff all the
possibilities in $C_i \cap \llbracket \neg q \rrbracket$ are possibilities that $\llbracket \textit{must}\ p \rrbracket^{C+\neg q}$ maps to true.
Thus we have to see whether the following holds:
The operator view isn’t at odds with consistency after all. It is also easy
to predict if/must (Fact 2) and if/might (Fact 3). Here are the narrowscoped
analyses of the motivating examples:
9 What is at stake
Given the success of anti-iffiness why bother with iffiness at all? A fair
question. Given the context-shifting I’m advocating for fans of iffiness, what’s
the difference between old school and new school? Another fair question. I
owe some answers.
I make three (not wholly unrelated) claims. First, even if the shifty version
of the operator view and the basic version of the restrictor view covered the
same ground, there is still reason to explore the operator view. Second, the
views have different conceptual roots and different allegiances. Third, the
views don’t cover the same ground. I need to argue for each of these.
Suppose that — at least when it comes to accounting for data about the
sorts of constructions at issue here — there’s nothing to choose between
iffiness + shiftiness and anti-iffiness. Even under that assumption there
is reason to take this version of the operator view seriously. That is because
it is important to set the record straight. Maybe you don’t like skyhooks,
Chuck Taylors, and conditional connectives expressing iffy operators in your
lfs. It is important to know that whatever your reasons, it can’t be because
iffiness can’t be squared with the Facts about how ifs and modals interact.
The Ramsey test intuition leads naturally to a story according to which
if expresses a bona fide conditional operator that captures the restricting
behavior of if-clauses. Thus the restricting behavior of if-clauses can be a
28 Before I said that I wanted to ignore issues about how this version of the operator view can
meet Lewis’s challenge about the ways if-clauses and adverbs of quantification interact,
saving that argument for another day. I want to stick to that (it really is an argument for
another day), but the general idea is straightforward. First, adjust the kinds of information
represented by a context so that we can sensibly quantify over individuals and the events
they participate in. Second, allow that quantificational domains can be restricted by material
in if-clauses; those domains play the role of the subordinate or derived context. Adverbs
of quantification appear under the conditional and have their usual denotations.
part of, rather than an obstacle to, their expressing something iffy. That is
cool.
But what’s the real difference between the views? One view says we have
no conditional operator, just a complicated modal with a slot for a restrictor.
The other says we have a conditional operator but that its antecedent shifts
the context thereby acting like a restrictor. Tomato/tomăto, right? Wrong!
Here is one way of seeing that. Consider three indicatives:
Compare (23a) and (23c). The restrictor view says these have different modals
and different arguments for each of the slots in those modals. So, apart from
the fact that each is a modal expression of some flavor or other, there is
nothing much in common between the two. They are as different as Some
students smoke and All dogs bark: each is a quantificational expression of
some flavor or other. The operator view says something different. It says that,
despite their different antecedents and different consequents, they still share
a common iffy core: there is a conditional connective in common between
them and it contributes the same thing to each of the sentences it occurs in.
Or compare the must-enriched (23a) with its bare counterpart (23b). The
restrictor view says the bare indicative just is the must-enriched version
in disguise. That is how it predicts if/must (Fact 2). It thus treats bare
indicatives as a special case, dealt with by positing a covert and inaudible
necessity modal. Maybe there is reason to posit such an operator, and an
independent and principled reason to posit the necessity modal instead of an
existential one or some different modal with different quantificational force,
and maybe those reasons outweigh the cost of the positing. The operator
view adopts a very different stance here and that is what I want to point out.
It says that bare indicatives like (23b) are ordinary conditionals and their
counterparts with must-ed consequents like (23a) are ordinary conditionals
that happen to have must in their consequents. No special cases, no positing
of inaudible operators, and if/must comes out as a prediction not as a
stipulation. None of this is a knock-down argument for or against either of
the views — it’s not meant to be — but it does highlight their difference in
worldview.
All of this has been under the assumption that both the doubly shifty iffy
view and the anti-iffy restrictor view cover the same ground about how ifs
and modals interact. But that’s not quite right.29 So far we have only worried
about how it is that a conditional sentence manages to express what might be
if such-and-such or how it manages to express what must be if such-and-such.
But conditional information can be more economically expressed than that.
We can just as well have a single conditional sentence that expresses what
must be and what might be if such-and-such.
A case in point: although I have lost my marbles, I know that some of
them — at least one of Red, Yellow, and Blue — are in the box. In fact I know
a bit more. I know that Yellow and Blue are in the same spot and so that Red
can’t be elsewhere if Yellow isn’t in the box. Another example: arriving at
the party, I’m not sure who’s there and who isn’t. I do know that Lenny goes
wherever Carl goes (but sometimes Lenny goes alone), but Monty never goes
where Lenny goes.
(24) a. If Yellow is in the box, then Red might be and Blue must be.
b. If Lenny is at the party, then Carl might be but Monty isn’t.
These are not exotic, each conditional is a true thing to say in the circum-
stances, and there is space for the iffy view and incarnations of the anti-iffy
restrictor view to differ on the truth conditions they assign to conditionals
like these — and so the two views can’t be stylistic variants.
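The marbles conditional (24a) can be run through a shifty evaluator to see that it comes out true in the scenario. The sketch makes several assumptions: the scenario is modelled as the worlds where at least one marble is in the box and Yellow and Blue are in the same spot, adding the antecedent to a context intersects each live set with the antecedent worlds, and the tuple encoding and the function `ev` are hypothetical. The second formula conjoins two separate conditionals with the same antecedent, for comparison.

```python
from itertools import product

ATOMS = ("r", "y", "b")
# A world is modelled as the frozenset of atoms true at it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([True, False], repeat=len(ATOMS))]

# Assumed model of what I know: some marble is in the box, and Yellow and
# Blue are in the same spot.
LIVE = {w for w in WORLDS if w and (("y" in w) == ("b" in w))}

def C(k):
    return LIVE if k in LIVE else {k}

def ev(f, ctx):
    """Set of worlds at which formula f is true relative to context ctx."""
    op = f[0]
    if op == "atom":
        return {w for w in WORLDS if f[1] in w}
    if op == "and":
        return ev(f[1], ctx) & ev(f[2], ctx)
    if op == "must":
        X = ev(f[1], ctx)
        return {k for k in WORLDS if ctx(k) <= X}
    if op == "might":
        X = ev(f[1], ctx)
        return {k for k in WORLDS if ctx(k) & X}
    if op == "if":
        ante = ev(f[1], ctx)
        # Assumed update: the consequent is read against the context plus the antecedent.
        shifted = lambda k: ctx(k) & ante
        cons = ev(f[2], shifted)
        return {k for k in WORLDS if (ctx(k) & ante) <= cons}
    raise ValueError(f"unknown operator: {op}")

r, y, b = ("atom", "r"), ("atom", "y"), ("atom", "b")
# (24a): If Yellow is in the box, then Red might be and Blue must be.
f24a = ("if", y, ("and", ("might", r), ("must", b)))
# Two conditionals conjoined, each with the antecedent repeated.
f_gloss = ("and", ("if", y, ("might", r)), ("if", y, ("must", b)))

f24a_true_everywhere_live = all(w in ev(f24a, C) for w in LIVE)
same_truth_conditions = ev(f24a, C) == ev(f_gloss, C)
# Read against the unshifted context, "Blue must be" is false at live worlds.
must_b_unshifted = any(w in ev(("must", b), C) for w in LIVE)
```

On these assumptions the single conditional and the conjoined pair are true at exactly the same worlds, and the must in the consequent is only true because it is evaluated after the antecedent has shifted the context.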
Here is the issue: (24a) and (24b) have glosses:
(25) a. If Yellow is in the box, then Red might be and if Yellow is in the box,
then Blue must be.
29 There are reasons independent of interaction with epistemic modals to think that anti-iffiness,
in its purest if-only-restricts form, can’t be the whole story. If it were, and if-clauses
and when-clauses have the same restricting behavior, then we wouldn’t expect differences in
cases like this:
(i) a. If the Cubs get good pitching and timely hitting after the break, they might win
it all.
b. When the Cubs get good pitching and timely hitting after the break, they might
win it all.
But we do detect a difference. I can say something true-if-hopeful with (ia). But (ib) passes
optimistic and heads straight for delusional. It’s hard to see where to locate the differ-
ence — whether it’s semantic or pragmatic — if the semantic contribution of if and when is
purely to mark the restrictor slot for the common operator might. (Lewis (1975) noticed
that sometimes a restricting if is odd when its corresponding restricting when is fine. But
he labeled these differences “stylistic variations”.) Some arguments along these lines are
pushed by von Fintel & Iatridou (2003).
A. S. Gillies
But the truth conditions of (26) do not match the truth conditions of (25a)
and so do not match the truth conditions of the original (24a): (26) is false in
the context as we set it up even though both (24a) and (25a) are true.
Now assume, instead, that a modal is overt iff it is pronounced — no
matter how arbitrarily deeply embedded. Then (26) isn’t the right anti-iffy
LF for (24a). Instead, we get something more sensible: (24a) and (25a) have
the same LF. There’s no in-principle problem with that.31 But what about
conditionals like (24b)? We don’t want to posit a must that outscopes the
pronounced might. So we have to posit a narrowscoped one. In order to
get the posited modal appropriately restricted — so that (24b) comes out
equivalent to (25b) — we have two obvious options. Option (i): Argue that
conditionals like those in (24) are not single conditionals at all, that they are
really conjunctions of two simple modals. That way there is no difference
at all between the conditionals in (24) and the glosses in (25). Option (ii):
Enrich our intermediate language to allow for explicit domain-restricting
variables, and provide a mechanism for the inheriting of those restrictions
30 In this sense, a modal is any (non-equivalent) stack of musts, mights, and negations.
31 Though it doesn’t come free: it puts strain on the process of assigning formulas of L to serve
as the LFs of sentences of natural language.
Iffiness
across intervening operators like conjunction. Both options are open, and
party line proponents of anti-iffiness are free to pursue them. But they do
require work. Option (i) posits movement we’d not like to have to posit, treats
conditionals with apparent conjoined consequents as yet another special
case, and describes rather than explains why the conditionals in (24) are
glossable by those in (25). Option (ii) requires more expressive resources
for L than we thought necessary and requires something over and above
the anti-iffy story as it stands to say when and how domain restriction gets
inherited over distance and across intervening operators. That’s not an
argument against this option but a description of it.32
But none of that really matters: my point was that iffiness + shiftiness
and anti-iffiness aren’t notational variants. And they are not: the iffy story
takes conditionals like (24) in perfect stride. No special cases, no positing
of inaudible operators, no stress on the parser in assigning formulas of
L to serve as the LFs of conditional sentences, no movement. We get the
right truth conditions, and we get as a prediction not a stipulation that the
conditionals in (24) are equivalent to those in (25).
Not every fan of old school iffiness will want to follow me this far. But there
is a cost to cutting their trip short since they must then deny or explain away
one of the Facts. Iffiness, they’ll no doubt point out, is not without its own
costs: the price of iffiness is shiftiness twice over.
I reply that there are costs and then there are costs. Embracing context-
shiftiness may be a cost, but I want to point out that it is not a new cost: it
makes the analysis here a broadly dynamic semantic account of indicatives.33
So shiftiness is a cost you may already be willing to bear. I want to (briefly)
point out how it is that this shiftiness amounts to a four-fold dynamic
perspective on modals and conditionals.
32 Something in the neighborhood of Option (ii) is developed (though not with an eye to
conjoined consequents) in von Fintel (1994). For a recent discussion see Rawlins 2008.
33 The general idea that consequents are evaluated in a subordinate or derived context is
standard in dynamic semantics — see, e.g., dynamic treatments of donkey anaphora (Groe-
nendijk & Stokhof 1991) or dynamic treatments of presupposition projection in conditional
antecedents and consequents (Heim 1992; Beaver 1999) or dynamic treatments of counter-
factuals (Veltman 2005; von Fintel 2001; Gillies 2007). But exploiting a derived context isn’t
quite a litmus test for dynamics since that is something shared by a lot of Ramsey-inspired
accounts, whether or not they count as ‘dynamic’.
The version of the operator view I’m advocating for fans of iffiness takes
the truth of an indicative (at an index, in a context) to be doubly shifty.
That doubly shifty behavior makes the semantics dynamic in the sense that
interpretation both affects and is affected by the values of contextually
filled parameters. Whether (if p)(q) is true at i in C depends on C; the
indicative can be true at i for some choices of C and false at i for others. So
interpretation is context-dependent. Whether (if p)(q) is true at i in C also
depends on the subordinate context C + p. Interpreting the indicative in C
affects — temporarily — the context for interpreting some subparts of it. So
interpretation is also context-affecting.
This analysis is also dynamic in a second sense. It makes certain sentences
unstable — the truth-value a sentence gets in a context C is not a stable or
persistent property since it can have a different truth-value in a context C′
that contains properly more information.
The boolean bits are, of course, both t- and f-persistent and so persistent full
stop. But not the modals: might, being existential, is f- but not t-persistent;
must goes the other way. And since if is a strict conditional, equivalent to a
necessity modal scoped over a material conditional, its pattern of persistence
is just like that for must.34
These two senses in which the story is dynamic are two sides of the same
coin. Together they explain how it is that the narrowscoped conditionals
(if ¬p)(must q) and (if ¬q)(must p) are consistent with the partitioning
modals in might p ∧ might q. From the fact that i ∈ ⟦(if ¬p)(must q)⟧^C and
i ∈ ⟦¬p⟧^C it does not follow that i ∈ ⟦must q⟧^C. Indeed, with my marbles
lost, this is sure to be false at i in C since might p is true. What is true at i is
that, in the subordinate or derived context C + ¬p, must q is true. That
is allowed because must isn’t f-persistent. But that is not at odds with the
might claim. And mutatis mutandis for the other if.
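The consistency claim can be checked mechanically. What follows is a minimal sketch in Python, not the official definitions of the paper: it assumes that a context is simply a set of worlds, that must and might quantify over the context, and that (if p)(q) is a strict conditional whose consequent is evaluated in the derived context C + p. The two-atom model and every function name are illustrative assumptions.

```python
from itertools import product

# Worlds are (p, q) truth-value pairs; a context C is a frozenset of worlds.
# A sentence maps a world and a context to a truth value.

def atom(k):
    return lambda w, C: w[k]

def neg(s):
    return lambda w, C: not s(w, C)

def must(s):
    # true iff the prejacent holds throughout the context
    return lambda w, C: all(s(v, C) for v in C)

def might(s):
    # true iff the prejacent holds somewhere in the context
    return lambda w, C: any(s(v, C) for v in C)

def plus(C, s):
    # derived context C + s: the worlds of C at which s is true
    return frozenset(v for v in C if s(v, C))

def if_(antecedent, consequent):
    # strict conditional; the consequent is evaluated in C + antecedent
    def sentence(w, C):
        D = plus(C, antecedent)
        return all(consequent(v, D) for v in D)
    return sentence

p, q = atom(0), atom(1)
ALL = frozenset(product([True, False], repeat=2))
C = frozenset(w for w in ALL if w[0] or w[1])  # the context settles only: p or q
i = (False, True)                              # an index at which ¬p holds

partition_ok = might(p)(i, C) and might(q)(i, C)
cond_1 = if_(neg(p), must(q))(i, C)
cond_2 = if_(neg(q), must(p))(i, C)
not_p_at_i = neg(p)(i, C)
must_q_in_C = must(q)(i, C)
must_q_in_derived = must(q)(i, plus(C, neg(p)))
```

Under these assumptions both conditionals and both might claims hold at i in C, and ¬p holds at i, yet must q fails in C while holding in the derived context C + ¬p.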
34 This pattern makes the treatment of indicatives here similar in some respects to Veltman’s
(1985) data semantic treatment of indicatives. But there are important differences between
the two stories. Here’s one: (if p)(might q) is data semantically equivalent to (if p)(q).
That won’t do given Fact 3.
So we have dynamics twice over. But so far none of this looks quite
like what is usually called “dynamic semantics”. In that sense of dynamics
meaning isn’t associated with truth conditions or propositions but with
context change potentials, effects on relevant states of information. Take
an information state s to be a set of worlds, and say that what a sentence
means is how its LF updates information states. That assigns to sentences
the semantic type usually reserved for programs and recipes; they express
relations between states — intuitively, the set of pairs of states such that
executing the program in the first state terminates in the second. We can
think of all sentences in this way, thereby treating them as instructions for
changing information states. Thus: the meaning of a sentence p is how it
changes an arbitrary information state. We might put that by saying the
denotation [p] applied to s results in state s′; in post-fix notation s[p] = s′.35
Now say that p is true in s iff s[p] = s, for then the information p carries is
already present in s.36
Having gone this far, we can make good on the Ramsey test this way:
Some programs have as their main point to make such-and-such the case;
others to see whether such-and-such. Programs of the latter type are tests
and they either return their input state (if such-and-such) or fail (otherwise).
That is the kind of program Definition 10.2 says if is.37 It says an if tests
s to see whether the consequent is true in s[p]. But — in good Ramseyian
spirit — s[p] is just the subordinate context got by hypothetically adding p
to s. Truth isn’t persistent here, either. That is because a state may pass a
test posed by an existential (Are there p-possibilities?) and yet have
35 For the fragment without ifs the updates are as you would expect (Veltman 1996). For the
if-free fragment of L, define [·] as follows:
i. s[p_atomic] = {i ∈ s : i(p_atomic) = 1}
ii. s[¬p] = s \ s[p]
iii. s[p ∧ q] = s[p][q]
iv. s[might p] = {i ∈ s : s[p] ≠ ∅}
It then follows straightaway that, for the if- and modal-free fragment, s[p] = s ∩ ⟦p⟧.
36 This generalizes the plain vanilla story about satisfaction we were taught when first learning
propositional logic: as the story usually goes, a boolean p is true relative to a set of
possibilities s iff all the possibilities in s are in ⟦p⟧. But that is equivalent to saying that
adding p to the information in s produces no change: s ∩ ⟦p⟧ = s iff s ⊆ ⟦p⟧.
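The update clauses in footnote 35 are simple enough to run. The sketch below, in Python, is offered under stated assumptions: formulas are encoded as nested tuples, a world is the set of atoms true at it, must is taken to be the dual of might, and the clause for if is one reading of the prose description above (a test on whether the consequent is true in s[p]), not a transcription of Definition 10.2. The names update, true_in and ex_24a are hypothetical.

```python
from itertools import combinations

ATOMS = ("R", "Y", "B")  # Red, Yellow, Blue is in the box

def all_worlds(atoms):
    # a world is the frozenset of atoms true at it
    return frozenset(frozenset(c) for n in range(len(atoms) + 1)
                     for c in combinations(atoms, n))

def update(s, f):
    """Return s[f] for a formula f given as a nested tuple."""
    op = f[0]
    if op == "atom":                      # s[p] = {i in s : i(p) = 1}
        return frozenset(i for i in s if f[1] in i)
    if op == "not":                       # s[¬p] = s \ s[p]
        return s - update(s, f[1])
    if op == "and":                       # s[p ∧ q] = s[p][q]
        return update(update(s, f[1]), f[2])
    if op == "might":                     # test: s if s[p] is non-empty, else ∅
        return s if update(s, f[1]) else frozenset()
    if op == "if":                        # test: is the consequent true in s[p]?
        sp = update(s, f[1])
        return s if update(sp, f[2]) == sp else frozenset()
    raise ValueError(f"unknown operator: {op!r}")

def must(f):
    # assumption: must is the dual of might
    return ("not", ("might", ("not", f)))

def true_in(s, f):
    # p is true in s iff s[p] = s
    return update(s, f) == s

R, Y, B = (("atom", a) for a in ATOMS)

# Marble scenario: at least one marble is in the box, and Yellow and Blue
# are in the same spot.
s = frozenset(w for w in all_worlds(ATOMS)
              if w and (("Y" in w) == ("B" in w)))

ex_24a = ("if", Y, ("and", ("might", R), must(B)))
might_not_Y = ("might", ("not", Y))
```

On this encoding (24a) is true in the marble state, and might ¬Y is true in s but no longer true in s[Y], which illustrates the failure of persistence for existential tests.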
37 See, e.g., Gillies 2004.
11 An iffy upshot
References
Groenendijk, Jeroen & Martin Stokhof. 1991. Dynamic predicate logic. Linguistics
and Philosophy 14(1). 39–100. doi:10.1007/BF00628304.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude
verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Higginbotham, James. 2003. Conditionals and compositionality. Philosophical
Perspectives 17(1). 181–194. doi:10.1111/j.1520-8583.2003.00008.x.
Jackson, Frank. 1987. Conditionals. Oxford University Press.
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jürgen
Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New approaches
in word semantics (Research in Text Theory 6), 38–74. Berlin: de
Gruyter.
Kratzer, Angelika. 1986. Conditionals. Proceedings of the Chicago Linguistics
Society [CLS] 22(2). 1–15.
Lewis, David. 1973. Counterfactuals. Cambridge, MA: Harvard University
Press.
Lewis, David. 1975. Adverbs of quantification. In Edward Keenan (ed.), Formal
semantics of natural language, 3–15. Cambridge University Press.
Lewis, David. 1976. Probabilities of conditionals and conditional probability.
The Philosophical Review 85(3). 297–315. doi:10.2307/2184045.
Rawlins, Kyle. 2008. (Un)Conditionals. Santa Cruz, CA: UC Santa Cruz dissertation.
Stalnaker, Robert. 1968. A theory of conditionals. In Nicholas Rescher (ed.),
Studies in logical theory (American Philosophical Quarterly Monograph
Series 2), 98–112. Blackwell.
Stalnaker, Robert. 1975. Indicative conditionals. Philosophia 5(3). 269–286.
doi:10.1007/BF02379021.
Veltman, Frank. 1985. Logics for conditionals. Amsterdam: University of
Amsterdam dissertation.
Veltman, Frank. 1996. Defaults in update semantics. Journal of Philosophical
Logic 25(3). 221–261. doi:10.1007/BF00248150.
Veltman, Frank. 2005. Making counterfactual assumptions. Journal of Semantics
22(2). 159–180. doi:10.1093/jos/ffh022.
Anthony S. Gillies
Department of Philosophy
Rutgers University
thony@rci.rutgers.edu
Semantics & Pragmatics Volume 3, Article 9: 1–74, 2010
doi: 10.3765/sp.3.9
1 Introduction
Cross-linguistic variation in modality systems: The role of mood
[Table: column headings “quantificational force” and “conversational background”]
Lisa Matthewson
b. guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’
I will show that the St’át’imcets subjunctive differs markedly from Indo-
European subjunctives, both in the environments in which it is licensed, and
in its semantic effects. I propose an analysis of the St’át’imcets subjunctive
which adopts insights put forward by Portner (1997, 2003). For Portner,
moods in various Indo-European languages place restrictions on the con-
versational background of a governing modal. I argue that the St’át’imcets
subjunctive mood can be analyzed within exactly this framework, with the
twist that in St’át’imcets, the restriction the subjunctive places on the gov-
erning modal obligatorily weakens the force of the proposition expressed.
This has an interesting consequence. While we can account for the
St’át’imcets subjunctive using the same theoretical tools as for Indo-European,
at a functional level the two languages are using their mood systems to
achieve quite different effects. In particular, St’át’imcets uses its mood sys-
tem to restrict modal force — precisely what this language does not restrict
via its lexical modals. At a functional level, then, we find the same kind of
cross-linguistic variation in the domain of mood as we do with modals. This
idea is illustrated in the simplified typology in Table 2:
These results suggest that while individual items in the realm of mood and
modality lexically encode different aspects of meaning, the systems as a
whole have very similar expressive power.
The structure of the paper: Section 2 introduces the St’át’imcets subjunc-
tive data. I first illustrate the nine different uses of the relevant agreement
paradigm, and then argue that this agreement paradigm is a subjunctive,
rather than an irrealis mood. Section 3 shows that the St’át’imcets sub-
junctive is not amenable to existing analyses of more familiar languages.
Section 4 reviews the basic framework adopted, that of Portner (1997), and
Section 5 provides initial arguments for adopting a Portner-style approach
for St’át’imcets. Section 6 presents the formal analysis, and Section 7 applies
the analysis to a range of uses of the subjunctive. Section 8 concludes and
raises some issues for future research.
[Table: subject paradigms grouped as “indicative” (comprising “indicative” and “nominalized”) and “subjunctive”]
With transitive predicates, the situation is similar, except that there are
four separate paradigms, one of which is subjunctive.3,4
2 The cognate forms are often called ‘conjunctive’ in other Salish languages, primarily in order
to disambiguate the abbreviations for ‘subject’ and ‘subjunctive’. See for example Kroeber
1999.
3 The traditional terms for the first two columns are ‘indicative’ and ‘nominalized’ respectively.
The nominalized endings are identical to nominal possessive endings, and are glossed as
‘poss’ in the data. The choice between these first two paradigms is syntactically governed: the
so-called ‘indicative’ surfaces in matrix clauses and relative clauses, while the nominalized
paradigm appears in subordinate clauses. Both these sets contrast semantically, in all
syntactic environments, with the subjunctive, hence my overall categorization of the first
two paradigms as ‘indicative’.
4 See Kroeber 1999 and Davis 2000 for justification of the analysis of subject inflection
This use of the subjunctive is very restricted (see van Eijk 1997: 147).
Minimal pairs cannot usually be constructed for ordinary assertions, as
shown in (7)–(9).
When used with the deontic modal ka, in addition to the ‘wish’ interpre-
tation shown in (10)–(11), the subjunctive can also render a ‘pretend to be ...’
interpretation.6
6 The data in (12) are from the Upper St’át’imcets dialect; in Lower St’át’imcets, (12a) is
corrected to (i), which has the subjunctive but lacks the deontic modal. This independent
(16) a. kanem=lhkán=k’a
do.what=1sg.indic=infer
‘What happened to me?’
b. kanem=án=k’a
do.what=1sg.sbjn=infer
‘I don’t know what happened to me.’ / ‘I wonder what I’m doing.’8
The same effect arises with yes-no questions. In combination with the evi-
dential k’a or a future modal, the subjunctive also turns these into statements
of uncertainty which are often translated using ‘maybe’ or ‘I wonder’.
8 For expository reasons, k’a was glossed as ‘epistemic’ in (3a) above, but from now on will be
glossed as ‘inferential’. Matthewson et al. (2007) analyze k’a as an epistemic modal which
carries a presupposition that there is inferential evidence for the claim.
These are all the cases where the subjunctive has a semantic effect; in
the next sub-section we will also see some cases where the subjunctive is
obligatory and semantically redundant. I will not aim to account for the entire
panoply of subjunctive effects in one paper. However, the analysis I offer
will explain the first seven uses, setting aside for future research only the
two uses which involve the particle t’u7. See Section 8 for some speculative
comments about the subjunctive in combination with t’u7.
In this sub-section I justify the use of the term ‘subjunctive’ for the subject
agreements being investigated. The choice of terminology is intended to
reflect the fact that the St’át’imcets mood patterns with Indo-European sub-
junctives, rather than with Amerindian irrealis moods, in several respects.
However, we will see below that the St’át’imcets subjunctive also differs
9 This raises a terminological issue which arises in many areas of grammar. Should we apply
terms which were invented for European languages to similar — but not identical — categories
in other languages? For example, should we say ‘The perfect / definite determiner /
subjunctive in language X differs semantically from its English counterpart’, or should we
say ‘Language X lacks a perfect / definite determiner / subjunctive’, because it lacks an
element with the exact semantics of the English categories? I adopt the former approach
here, as I think it leads to productive cross-linguistic comparison, and because it suggests
that the traditional terms do not represent primitive sets of properties, but rather potentially
decomposable ones.
10 Palmer does not provide a definition of ‘non-assertion’. He observes that common reasons
why a proposition is not asserted are because the speaker doubts its veracity, because the
proposition is unrealized, or because it is presupposed (Palmer 2006: 3). See Section 3 below
for discussion.
(31) a. *táyt=kacw=an’
hungry=2sg.indic=perc.evid
‘You must be hungry.’
b. táyt=acw=an’
hungry=2sg.sbjn=perc.evid
‘You must be hungry.’
The vast majority of formal research on the subjunctive deals with Indo-
European. In languages such as the Romance languages, the subjunctive
mood is used for wishes, fears, speculations, doubts, obligations, reports,
unrealized events, or presupposed propositions. Some examples are provided
in (33)–(34).
worlds in the context set where the complement is false, these predicates
eliminate worlds in the context set which are low on an evaluative ranking.13
Thus, these predicates take the subjunctive:
Different rankings of these two constraints give rise to different mood choices in Romanian
vs. French for emotive factive predicates like ‘be sorry/happy’, ‘regret’. Emotive factives are
+Decided but -Assertive, and take the indicative in Romanian and the subjunctive in French.
14 Giannakidou (2009) proposes that the Modern Greek subjunctive complementizer na con-
tributes temporal semantics (introducing a ‘now’ variable). The generalization is still that
subjunctive appears in non-veridical contexts; see Giannakidou 2009 for details.
Jacobs (1992) analyzes the mood distinction in Skwxwú7mesh as encoding speaker certainty,
which suggests that it differs from the St’át’imcets mood system.
17 The expected subject inflection in the embedded clauses in (38) would actually be possessive
=s; see van Eijk 1997 and Davis 2006. However, many modern speakers prefer to omit the
possessive ending and to use matrix indicative =Ø in these contexts. This does not affect
the point at hand, as the variation is between two forms of indicative marking.
(40) for Catalan. Quer’s analysis of these examples involves a shifting of the
model in which the descriptive condition in the relative clause is interpreted;
the effect is one of apparent ‘wide-scope’ for the descriptive condition in the
indicative (40a), as opposed to in the subjunctive (40b).
The St’át’imcets subjunctive is also not like the Mohawk one. Unlike in
Mohawk, St’át’imcets futures take the indicative, as shown in (43); so do past
habituals, as shown in (44), and plain negatives, as in (45).
Finally, there are the cases where the St’át’imcets subjunctive does ap-
pear, with a predictable meaning difference, which are not attested in other
languages. These include the use of the St’át’imcets subjunctive to weaken
an imperative to a polite request, or to help turn a question into a statement
of uncertainty (see examples in (13)–(15) and (16)–(20) above).
I will argue below that in spite of these major empirical differences
between the St’át’imcets subjunctive and that of familiar languages, the basic
framework for mood semantics advanced by Portner (1997) can be adapted
to capture all the St’át’imcets facts. This will support Portner’s proposal
that moods are dependent on modals and place restrictions on the modal
environments in which they appear.
(47) For any reference situation r, modal force F, and modal context R,
⟦may_dep(φ)⟧^{r,F,R} is only defined if φ is possible with respect to
Dox_α(r), where α is the denotation of the matrix subject.
When defined, ⟦may_dep(φ)⟧^{r,F,R} = ⟦φ⟧^{r,F,R} (Portner 1997: 201)
Portner further argues that there are actually two mood-indicating may’s,
with slightly different properties. Mood-indicating may under wish, pray,
etc. (as in (46a)) or in unembedded clauses (as in (46c)) has an extra require-
ment: it presupposes that the accessibility relation R is buletic (deals with
somebody’s wishes or desires).
The discussion of mood-indicating may illustrates an important aspect
of Portner’s analysis, namely that moods place presuppositions on the modal
accessibility relation (a type of conversational background). With English
mood-indicating may, there is a doxastic and sometimes a buletic restriction.
For the English mandative subjunctive, which appears in imperatives as well
as in embedded contexts as in (48), R must be deontic, as shown in (49).
(48) Mary demands that you join us downstairs at 3pm. (Portner 1997: 202)
(49) For any reference situation r, modal force F, and modal context R,
⟦m-subj(φ)⟧^{r,F,R} is only defined if R is a deontic accessibility relation.
In the analysis to follow, I will adopt Portner’s idea that moods place
restrictions on a governing modal operator. I will argue that the empirical
differences between the St’át’imcets subjunctive and Indo-European sub-
junctives derive from the fact that the former restricts the conversational
background of the modal operator in such a way that the modal force is
weakened.
I deal here only with the constructions where the subjunctive has a semantic
effect; I will not address the cases of obligatory subjunctive agreement which
were presented in subsection 2.2.21 My analysis will account for all meaningful
uses of the St’át’imcets subjunctive except the two uses which contain the
particle t’u7. See Section 8 for some discussion of the t’u7-constructions.
19 Interestingly, the Italian indicative imposes a modal force restriction as well as a conver-
sational background restriction; it is only used with a force of necessity (Portner 1997:
197).
20 According to Giorgi and Pianesi, the subjunctive indicates that the ordering source is non-
empty; this is a restriction on a conversational background.
21 The analysis presented below is actually compatible with the obligatory presence of the
subjunctive in if -clauses introduced by lh=, and may even help to explain why lh= obligatorily
selects the subjunctive when it means ‘if’, but selects indicative when it means ‘before’.
Thanks to Henry Davis for discussion of this point, and see Davis 2006: chapter 26. (See also
van Eijk 1997: 217, although van Eijk analyzes the subjunctive-inducing lh= as distinct from
(e)lh= ‘before’.) As for the other obligatory cases of subjunctive, these may be grammaticized,
semantically bleached relics of original meaningful uses, aided by the fact that subjunctive
marking is intertwined with person agreement.
The first thing to establish is that like Portner’s moods, the St’át’imcets
subjunctive does not itself assert a modal semantics, but is dependent on
a governing modal operator. One piece of evidence for this is that the
St’át’imcets subjunctive must co-occur with an overt modal in almost all its
uses. Of the seven uses of the subjunctive being analyzed here, five of them
have an overt modal (the deontics, ‘pretend’, wh-questions, yes-no questions,
ignorance free relatives), one of them is plausibly analyzed as containing a
covert modal (imperatives), and only one is non-modal (plain assertions). As
noted above, the addition of the subjunctive to plain assertions is extremely
restricted and at least semi-conventionalized. If the subjunctive were itself
independently modal, it would be difficult to explain the minimal contrasts
in (50)–(51).22
itself contribute modal semantics. For example, (50b) does not mean ‘It
should be the case that the child should sleep’.
The St’át’imcets subjunctive also patterns morphosyntactically like a
mood rather than like real modals in the language. As shown above, the
subjunctive is obligatorily selected by some complementizers, unlike modals.
The subjunctive is also fused with subject marking into a full paradigm, unlike
the modals, which are independent second-position clitics.23 I therefore
conclude that the St’át’imcets subjunctive does not itself introduce a modal
operator, but requires one in its environment.
The core idea of my proposal is that the St’át’imcets subjunctive restricts its
governing modal only in such a way as to weaken the force of the proposi-
tion expressed. The intuition that the St’át’imcets subjunctive weakens the
proposition it adds to was already expressed by Davis (2006: chapter 24):
lexically encode quantificational force in St’át’imcets, this will mean that the
subjunctive must appear in the scope of a variable-force modal, and will
restrict it to a weakened interpretation.
6 Analysis
Rullmann et al. (2008) argue that there are two differences between English
universal modals like must and St’át’imcets modals. First, the St’át’imcets
modals place presuppositions on the conversational backgrounds. Second,
the set of best worlds is further narrowed down by a choice function which
picks out a potentially proper subset of the best worlds to be quantified
over. This can lead to a weaker reading, depending on context. The idea is
illustrated informally in (59).25
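The weakening effect of a choice function can be made concrete with a small sketch in Python. It is an illustration under assumptions, not Rullmann et al.’s formal definition: the set of best worlds is given directly, the choice function f is any function returning a subset of its input, and the world names are hypothetical.

```python
def necessity(best, q, f=lambda worlds: worlds):
    """Universal quantification over f(best); f defaults to the identity."""
    chosen = f(best)
    if not chosen <= best:
        raise ValueError("a choice function must return a subset of its input")
    return all(w in q for w in chosen)

best = frozenset({"w1", "w2", "w3"})   # hypothetical best worlds
q = frozenset({"w1", "w2"})            # worlds where the prejacent is true

# identity choice function: every best world is checked (strong reading)
strong = necessity(best, q)
# a proper subset is selected: fewer worlds are checked (weaker reading)
weak = necessity(best, q, f=lambda ws: ws & {"w1", "w2"})
```

With the identity function the modal claim fails here, because w3 is a best world where the prejacent is false; with a choice function that selects a proper subset, the same claim comes out true.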
function from being the identity function.26 This is illustrated informally for
a deontic case in (60).
(60) guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’
I adopt the following basic definitions from von Fintel & Heim 2007. (61)
shows the ordering of worlds according to how well they satisfy the set of
propositions in the ordering source, and (62) shows how the best worlds are
selected.
(61) Given a set of worlds X and a set of propositions P, define the strict
partial order <_P as follows:
∀w₁, w₂ ∈ X : w₁ <_P w₂ iff {p ∈ P : p(w₂) = 1} ⊂ {p ∈ P : p(w₁) = 1}
For any worlds w₁ and w₂, w₁ comes closer to the ideal set up by
the ordering source than w₂ does iff the set of propositions in the
ordering source which are true in w₂ is a proper subset of the set of
propositions in the ordering source which are true in w₁.
(62) For a given strict partial order <_P on worlds, define the selection
function max_P that selects the set of <_P-best worlds from any set X
of worlds:
∀X ⊆ W : max_P(X) = {w ∈ X : ¬∃w′ ∈ X : w′ <_P w}
(von Fintel & Heim 2007: 55)
The best worlds are those for which there are no worlds closer to the ideal
than they are. The analysis of English must is given in (63). must takes as
arguments a modal base, an ordering source and a proposition, and asserts
that in all the best worlds in the modal base, as defined by the ordering
source, the proposition is true.27,28
(63) ⟦must⟧^{c,w} = λh_⟨s,⟨st,t⟩⟩ . λg_⟨s,⟨st,t⟩⟩ . λq_⟨s,t⟩ . ∀w′ ∈ max_{g(w)}(∩h(w)) : q(w′) = 1
(von Fintel & Heim 2007: 55)
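Definitions (61)–(63) translate almost directly into executable form. The following Python sketch is one way to do so, under the assumption that propositions are modelled as frozensets of worlds (so that p(w) = 1 amounts to w ∈ p) and that the set W of all worlds is passed in explicitly. The function names and the three-world model are illustrative assumptions.

```python
def closer(w1, w2, P):
    # w1 <_P w2 iff the members of P true at w2 are a proper subset
    # of the members of P true at w1
    return {p for p in P if w2 in p} < {p for p in P if w1 in p}

def max_P(X, P):
    # the <_P-best worlds of X: no world in X is strictly closer to the ideal
    return {w for w in X if not any(closer(v, w, P) for v in X)}

def must(h, g, q, w, W):
    # h (modal base) and g (ordering source) map a world to a collection
    # of propositions; q is the prejacent proposition
    base = set(W).intersection(*h(w))     # ∩h(w); equals W when h(w) is empty
    return all(v in q for v in max_P(base, g(w)))

# Hypothetical model: three worlds, a trivial modal base, and an ordering
# source on which "a" satisfies two ideals, "b" one, and "c" none.
W = {"a", "b", "c"}
h = lambda w: [frozenset({"a", "b", "c"})]
g = lambda w: [frozenset({"a", "b"}), frozenset({"a"})]

best_worlds = max_P(W, g("a"))
holds = must(h, g, frozenset({"a", "b"}), "a", W)
fails = must(h, g, frozenset({"b"}), "a", W)
```

In this toy model the only best world is "a", so must is true of any proposition containing "a" and false of any proposition that excludes it.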
Now for the subjunctive. As shown in (65), the subjunctive does not affect
truth conditions but merely enforces a weaker-than-necessity reading of a
modal in the environment. The subjunctive does not itself introduce any
conversational backgrounds; h and g in (65) are free variables. I assume
that this enforces anaphoricity: the mood must be c-commanded by a modal
which introduces h and g.29
As above, max_{g(w)}(∩h(w)) picks out the best worlds in the modal base,
as defined by the normative ordering source. The contextually determined
choice function f_c picks out a subset of max_{g(w)}(∩h(w)), and the modal
universally quantifies over the set picked out by the choice function. Because
the subjunctive mood presupposes that there is at least one world
29 Thanks to an anonymous reviewer for pointing out an inconsistency in an earlier version of
(65).
≠ ‘It is not the case that [in at least one of the best worlds in the
modal base, he doesn’t go, and in all of the set of worlds selected by
the choice function, he goes].’
i.e., ≠ ‘It is not the case that [it’s good if he goes, and I can still be
happy if he doesn’t].’
Nor can we test projection through ‘if’, as ‘if’-clauses obligatorily and re-
dundantly select the subjunctive in St’át’imcets (see subsection 2.2). However,
questions provide evidence that the subjunctive does not contribute ordinary
asserted content. Recall that the subjunctive plus an inferential evidential
30 Thanks to David Beaver and an anonymous reviewer for asking for clarification of this issue.
≠ ‘Is it the case that [in at least one of the best worlds compatible
with the inferential evidence, Lémya7 is not the chief, and in all of the
set of worlds selected by the choice function, Lémya7 is the chief]?’
i.e., ≠ ‘Is it the case that [Lémya7 is possibly but not necessarily the
chief]?’
Further evidence that the subjunctive does not contribute ordinary as-
serted content comes from the impossibility of directly affirming or denying
its contribution. This is shown in (69), where B and B’ try to deny A’s sub-
junctive claim that in at least one world compatible with A’s knowledge and
desires, the children don’t sleep. The consultant absolutely rejects the replies
in B and B’.
(70) After using the bathroom, everybody ought to wash their hands;
employees have to.
(von Fintel & Iatridou 2008: 116)
(71) also illustrates the contrast between the different modal strengths.
In (71a), taking Route 2 is the only option if you want to get to Ashfield: all
the worlds in which you get to Ashfield are Route 2-worlds. In (71b), there
are getting-to-Ashfield worlds other than the Route 2-worlds. But the
Route 2-worlds are the best, taking into consideration some other factors
(such as a scenic route).
31 For example, attempts to elicit ‘Hey, wait a minute!’ responses to presupposition failures for
a wide range of standard presupposition triggers have all failed (Matthewson 2006, 2008b).
We are therefore unable to decide the presupposition issue for the subjunctive by using the
‘Hey, wait a minute!’ test (as was suggested by an anonymous reviewer).
von Fintel and Iatridou argue that ought is a weak necessity modal, and
that weak necessity modals signal the existence of a secondary ordering
source. This is illustrated informally in (72)–(73). (72) contains a strong
necessity modal, and gives a strong reading, as usual. In (73), a secondary
ordering source further restricts the set of worlds which are universally
quantified over, leading to a weaker reading.
As von Fintel & Iatridou (2008: 137) put it: ‘The idea is that saying that
to go to Ashfield you ought to take Route 2, because it’s the most scenic
way, is the same as saying that to go to Ashfield in the most scenic way,
you have to take Route 2.’ This is closely parallel in spirit to Rullmann et al.’s
(2008) analysis of St’át’imcets modals, where a weak reading is obtained by
a universal quantifier with a restriction provided by a choice function. And
just like Rullmann et al.’s analysis, von Fintel and Iatridou’s actually predicts
gradience: how ‘weak’ a weak necessity modal is can vary, depending on
which secondary ordering source you pick. In fact, given that the motivation
for using a choice function rather than an ordering source was unconvincing
anyway (cf. Kratzer 2009, Peterson 2009, 2010, and Portner 2009), the
Rullmann et al.-style analysis is better implemented using a double ordering
source, exactly as in von Fintel & Iatridou 2008.32
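The contrast between strong and weak necessity can be made concrete with a small sketch. What follows is only an illustration under simplifying assumptions, not von Fintel and Iatridou's formalization: worlds are toy dictionaries, the names `best` and `necessity` are hypothetical, and an ordering source ranks worlds by how many of its propositions they satisfy (Kratzer's ordering compares sets of satisfied propositions rather than counts).

```python
# Toy model of the Ashfield example. All names and worlds here are hypothetical.
WORLDS = [
    {"route": 2, "arrive": True, "scenic": True},
    {"route": 9, "arrive": True, "scenic": False},
    {"route": 5, "arrive": False, "scenic": True},
]

def best(worlds, ordering_source):
    """Keep the worlds satisfying the largest number of ordering-source propositions.

    Simplifying assumption: worlds are ranked by a count, not by set inclusion.
    """
    if not worlds:
        return []
    score = lambda w: sum(1 for ideal in ordering_source if ideal(w))
    top = max(score(w) for w in worlds)
    return [w for w in worlds if score(w) == top]

def necessity(worlds, prejacent, *ordering_sources):
    """Universal quantification over the worlds that survive each ordering source in turn."""
    domain = worlds
    for source in ordering_sources:
        domain = best(domain, source)
    return all(prejacent(w) for w in domain)

goal = [lambda w: w["arrive"]]      # primary ordering source: you get to Ashfield
scenic = [lambda w: w["scenic"]]    # secondary ordering source: the route is scenic
take_route_2 = lambda w: w["route"] == 2

strong = necessity(WORLDS, take_route_2, goal)           # 'have to': one ordering source
weak = necessity(WORLDS, take_route_2, goal, scenic)     # 'ought to': two ordering sources
```

With these toy worlds the strong claim fails, because Route 9 also gets you to Ashfield, while the weak claim holds once the secondary ordering source narrows the domain to the scenic goal-worlds. Choosing a different secondary ordering source changes how weak the modal is, which is the gradience noted above.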
So what is the difference between English and St’át’imcets? Simply that in
English, we lexically encode the weak necessity (ought vs. have to/must). In
St’át’imcets, no differences in modal force are lexically encoded by modals,
but what English modals do, St’át’imcets does via mood. Another way of
describing the analysis offered here would be to say that the St’át’imcets
subjunctive enforces weak necessity (via domain restriction): it forces there
to be two (non-vacuous) restrictions on the set of worlds in the modal base.
While further cross-linguistic investigation goes beyond the scope of this
paper, it is worth pointing out a connection to another intriguing observation
of von Fintel and Iatridou’s, namely that in many languages, weak necessity
modals are created transparently from a strong necessity modal plus coun-
terfactual morphology. This is illustrated in (74) for French, where the modal
appears in the conditional mood, the one which occurs in counterfactual
conditionals.
(74) tout le monde devrait se laver les mains mais les serveurs
everybody must/cond refl wash the hands but the waiters
sont obligés
are obliged
‘Everybody ought to wash their hands but the waiters have to.’
(von Fintel & Iatridou 2008: 121)
necessity interpretations, and may offer a potential new avenue for looking
at languages like French.
7.1 Imperatives
Recall that the subjunctive, when added to an imperative, makes the com-
mand more polite. An example is repeated here:
serving as the modal base, and a contextually given set of preferences giv-
ing the ordering source. In addition, imperatives carry presuppositions, as
shown in (76). The presuppositions restrict an imperative to situations where
a performative use of a deontic modal would be possible, namely those in
which the speaker is an authority on the matter.34
Modal base: What the speaker and hearer jointly take to be possible
Ordering source: The speaker’s commands
(77) is true iff all worlds in the Common Ground that make true as much as
possible of what the speaker commands at the world and time of utterance
make it true that the addressee gets up within a certain event frame t
(Schwager 2005: chapter 6). The difference between (77) and the plain modal
statement ‘You must get up’ is that with the imperative, the speaker is
presupposed to be an authority. This has the consequence that whenever an
imperative is defined, it is necessarily true.
Adopting Schwager’s analysis enables us to treat the St’át’imcets sub-
junctive imperatives the same way we treated the weakened normative ka-
statements above. We have to assume that the deontic modal in a St’át’imcets
imperative is, like the overt ka, a universal modal which introduces a choice
function or secondary ordering source. While a normal imperative roughly
says that in all the best worlds (the worlds where you obey my commands),
34 The descriptive vs. performative use of a deontic modal is shown in (i), from Schwager
(2008: 26).
(i) a. Peter may come tomorrow. (The hostess said it was no problem.) descriptive
b. Okay, you may come at 11. (Are you content now?) performative
35 The preferences may relate to the addressee’s wishes, as in the case of advice or suggestions.
Given what we know the world to be like and given what you want, it
is necessary that you take an apple. (cf. Schwager 2008: 49)
(79) Context: Your friend comes over and is visiting with you. You hear
her stomach rumbling. You give her a plate and say ‘Have some cake!’
a. wá7=malh kiks-tsín-em
be=adhort cake-eat-mid
‘Have some cake!’
b. #wá7=acw=malh kiks-tsín-em
be=2sg.sbjn=adhort cake-eat-mid
‘You may as well have some cake.’
36 (79b) is marked as infelicitous in this context, which is how the consultant judges it. (80b)
appears to be ungrammatical. The difference possibly relates to the presence in (79b) of the
adhortative particle malh, an interesting element whose analysis must await future research.
37 An anonymous reviewer points out that permission imperatives should be able to take the
subjunctive in certain circumstances, meaning something like ‘the very best way to achieve
your desires is p, though there are other ways’. Future research is required to see whether
this prediction is upheld once the right discourse contexts are provided.
(80) Context: You are at a gathering and they are almost running out of
food. You take the last piece of fish and then you see an elder is
behind you and is looking disappointed and has no fish on her plate.
You say ‘Take mine!’
a. kwan ts7a ti=n-tsúw7=a
take(dir) deic det=1sg.poss-own=exis
‘Take mine!’
b. *kwán=acw ts7a ti=n-tsúw7=a
take(dir)=2sg.sbjn deic det=1sg.poss-own=exis
intended: ‘Take mine!’
7.2 Questions
b. *t’íq=as=ha k=Bill
arrive=3sbjn=ynq det=Bill sbjn
c. ?t’íq=ha=k’a k=Bill
arrive=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + indic
d. t’iq=as=há=k’a k=Bill
arrive=3sbjn=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + sbjn
(85) a. ínwat=wit
say.what=3pl
‘What did they say?’ indic
b. *inwat=wít=as
say.what=3pl=3sbjn sbjn
c. ??inwat=wít=k’a
say.what=3pl=infer
‘I wonder what they said.’ evid + indic
d. inwat=wít=as=k’a
say.what=3pl=3sbjn=infer
‘I wonder what they said.’ evid + sbjn
(88) Did I tell you it would be easy? ≈ I didn’t tell you it would be easy.
But this is not the meaning we get in St’át’imcets for conjectural questions.
In order to express a true rhetorical question, St’át’imcets speakers use
something which is string-identical to an ordinary question, just as in English.
This is illustrated in (89)–(90). (90b) shows that adding a subjunctive plus an
evidential to a rhetorical question results in rejection of the utterance.
(89) Context: Your daughter is complaining that learning how to cut fish
is hard. You say:
a. tsun-tsi=lhkán=ha k=wa=s lil’q
say(dir)-2sg.obj=1sg.indic=ynq det=impf=3poss easy
‘Did I tell you it would be easy?’
41 See Rocci 2007: 147 for the same claim for an Italian construction with similar semantics to
St’át’imcets conjectural questions.
(90) Context: You are at the PNE (a fair) and there is this very scary ride
which looks really dangerous. Your friend asks you if you are going
to go on it. You say:
a. tsut-anwas=kácw=ha kw=en=klíisi
say-inside=2sg.indic=ynq det=1sg.poss=crazy
‘Do you think I’m crazy?’
b. *tsut-anwas=ácw=ha=k’a kw=en=klíisi
say-inside=2sg.sbjn=ynq=infer det=1sg.poss=crazy
‘Do you think I’m crazy?’
The status of speaker and addressee knowledge also differs between rhetori-
cal questions and conjectural questions. In rhetorical questions, the speaker
knows the true answer to the question, and typically assumes that the hearer
does as well (e.g., Caponigro & Sprouse 2007). Subjunctive questions are the
exact opposite: neither the speaker nor the addressee typically knows the
answer.
In the remainder of this section I will first present the analysis of conjec-
tural questions which contain evidentials, and then explain an interesting
difference between the evidential and the future with respect to subjunctive
licensing.
First, we need an analysis of questions. I adopt a fairly standard approach,
according to which a question denotes a set of propositions, each of which
is a (partial, true or false) answer to the question (Hamblin 1973).42 This is
illustrated in (91)–(92).
(91) $[\![\text{does Hotze smoke}]\!]^{w}$ = {that Hotze smokes, that Hotze does not
smoke}
(92) $[\![\text{who left me this fish}]\!]^{w}$ = {that Ryan left me this fish, that Meagan
left me this fish, that Ileana left me this fish, ...} = $\{p : \exists x\,[p = \text{that } x
\text{ left me this fish}]\}$
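The set-of-propositions view of questions can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of the analysis: propositions are modelled as sets of world indices, the toy world space and the helper names `proposition`, `polar_question` and `wh_question` are hypothetical, and 'x left me this fish' is treated as an atomic fact about each person.

```python
from itertools import product

PEOPLE = ["Ryan", "Meagan", "Ileana"]

# Hypothetical toy model: a world records, for each person, whether they left the fish.
WORLDS = [dict(zip(PEOPLE, values))
          for values in product([True, False], repeat=len(PEOPLE))]

def proposition(predicate):
    """A proposition is the set of indices of the worlds in which the predicate holds."""
    return frozenset(i for i, w in enumerate(WORLDS) if predicate(w))

def polar_question(predicate):
    """{p, not-p}, as in (91)."""
    p = proposition(predicate)
    return {p, frozenset(range(len(WORLDS))) - p}

def wh_question(domain):
    """{p : there is an x in the domain such that p = 'x left me this fish'}, as in (92)."""
    return {proposition(lambda w, x=x: w[x]) for x in domain}

who_left = wh_question(PEOPLE)                    # one proposition per person
did_ryan = polar_question(lambda w: w["Ryan"])    # a proposition and its complement
```

The wh-question yields one alternative per individual in the domain, while the polar question yields exactly two alternatives that partition the world space.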
42 As far as I am aware, this choice is not critical and a different approach to questions would
work just as well.
I assume that the evidential modal scopes under the question operator,
so that each proposition in the question denotation contains the evidential.
A conjectural question thus bears some similarity to an English question
containing a possibility modal (e.g., ‘Could Bill have (possibly) arrived?’), with
the additional factor that the evidential introduces a presupposition about
evidence source. Following Guerzoni (2003), I assume that when a question
contains a presupposition trigger, each proposition in the alternative set
carries the relevant presupposition. The question therefore denotes a set of
alternative partial propositions. This is illustrated in (94).43
believe the question is easily answerable, and this lets the hearer off the hook
with respect to providing an answer.44
However, there are various problems with this analysis, as pointed out
by Littell (2009). One is that the evidence presuppositions are not always
contradictory. For example, a conjectural question such as ‘Who likes ice
cream?’ would presuppose for each contextually salient individual x that
there is inferential evidence that x likes ice cream. But it is perfectly possible
that everyone likes ice cream, and the evidence presuppositions in this case
do not rule out the possibility that the hearer knows the true answer. A
second problem is seemingly incorrect predictions about questions which
contain other evidentials, such as reportative or direct evidentials. Littell
argues that an analysis of conjectural questions which relies on conjoined
evidence presuppositions should predict reduced interrogative force for
any evidential question — yet cross-linguistically it is overwhelmingly only
inferential or conjectural evidentials which result in reduced interrogative
force. This is certainly true of St’át’imcets, as shown in the minimal pair in
(97).45
(i) p is not in the Common Ground and ¬p is not in the Common Ground
(iii) There is some set of facts E in CG, such that E is non-conclusive evidence in favor
of p
These are very similar to the effects of the St’át’imcets conjectural questions. However, Rocci
does not give a compositional analysis, perhaps partly because the che-subjunctives have no
overt evidentials or epistemic modals in the structure.
45 Cheyenne is an exception; reportatives in questions in Cheyenne allow non-interrogative
readings under certain circumstances (Murray to appear).
According to Littell (2009), this analysis accounts for the reduced inter-
rogative force of conjectural questions. The idea is that inferential evidence
is a fairly weak type of evidence, and a speaker who asks a question while
implicating that the hearer only has inferential evidence about the true an-
swer is letting the hearer off the hook with respect to answering. This is
intended to account for (a) the judgments of St’át’imcets consultants that
conjectural questions do not require an answer, (b) the fact that conjectural
questions are infelicitous when the addressee is likely to know the answer (cf.
(87)), and (c) the fact that conjectural questions are translated as ‘I wonder’
or ‘maybe’-statements (although they do not literally have the semantics
of ‘wonder’). ‘I wonder’ is simply a typical method in English of raising a
question without demanding an answer.
However, this account does not seem to predict a complete absence of
interrogative force. After all, the inferential evidence the hearer is assumed
to possess is better than no evidence at all. In line with this, an English
question like ‘According to the weak evidence you have, could Hotze smoke?’
still functions pragmatically as an interrogative. I conclude, therefore, that
interrogative flip plus implicatures about the absence of stronger evidence are
not sufficient in and of themselves to completely let the hearer off the hook
with respect to answering. This is actually a welcome result, since questions
containing k’a in the indicative mood are sometimes translated by speakers
into English using ordinary questions (rather than as statements of doubt;
see footnote 40). However, conjectural questions containing the subjunctive
are never translated as ordinary questions. I therefore assume that while a
question containing an evidential is already somewhat ‘weakened’ in terms
of its interrogative force, the subjunctive performs a further weakening. The
task now is to see whether this falls out from the analysis of the subjunctive
proposed above.
Recall that in the context of a governing modal, the subjunctive adds the
presupposition that in at least one of the best worlds in the modal base, the
proposition is false. The best worlds here (as the modal is epistemic) are
those which conform to the propositions known to be true, and in which
things happen as normal. Since the evidential has undergone interrogative
flip, the epistemically accessible worlds must also be flipped to be the worlds
compatible with the hearer’s knowledge. The results are shown in (100).47
As before, the implicature that the hearer does not have strong evidence
about the true answer, combined with the mixed-evidence effect of the
evidential presuppositions, will partially reduce the expectation that the
hearer is able to answer the question. In addition, thanks to the subjunctive,
the question now presupposes not only that the evidence about Bill’s possible
arrival is mixed, but also that there are worlds compatible with the hearer’s
knowledge in which Bill does come, and worlds compatible with the hearer’s
knowledge in which he does not come. In other words, the hearer does not
know whether he will come or not. The result is that a subjunctive conjectural
question has a significantly reduced expectation on the hearer to provide an
answer.48
The account just given, which incorporates the analysis of the St’át’imcets
subjunctive as weakening a modal proposition via domain restriction, suc-
47 An anonymous reviewer raises a potentially significant issue with the choice function
required for these cases. With the deontic and imperative cases discussed above, the choice
function had intuitive content (e.g., the ‘very best way to achieve some end’), but here the role
of the subjunctive is purely to make sure there are some ‘best worlds’ where the prejacent is
false. It is thus not clear which proper subset of the best worlds the function picks out.
48 As noted above, conjectural questions also imply that the speaker does not know the answer.
I assume that this follows, by Gricean reasoning, from the fact that the speaker uttered a
question, rather than having simply asserted the true answer. However, there is a bit more to
be said here, since plain questions in St’át’imcets allow a ‘display question’ use — a teacher
can ask (i):
As an anonymous reviewer points out, this display use should technically remain even when
the subjunctive is added. However, consultants judge the subjunctive version of (i) to no
longer be a teacher’s question, but a student’s reply:
Perhaps conjectural questions like (ii) simply do not make good questions for a teacher to
ask because they encode addressee ignorance.
(102) a. inwat=wít=kelh
say.what=3pl=fut
‘What will they say?’ fut + indic
b. inwat=wít=as=kelh
say.what=3pl=3sbjn=fut
‘I wonder what they will say.’ fut + sbjn
The contrast between the evidential and the future with respect to whether
the subjunctive is required to create a conjectural question is striking. So
far, I have argued that the evidential k’a contributes to reduced interrogative
force by means of an implicature that the hearer has no better than inferential
evidence for the true answer, and that the subjunctive contributes to further
reduced interrogative force by presupposing that it is compatible with the
hearer’s knowledge state that each possible answer is false. Now unlike k’a,
the future modal kelh has not been analyzed as an epistemic modal, and it
does not introduce any evidence presuppositions. The denotation for kelh is
given in (103).
If defined, $[\![\textit{kelh}(h)(g)]\!]^{c,w,t} =
\lambda q_{\langle s,\langle i,t\rangle\rangle}.\ \forall w' \in f_c(\max_{g(w)}(\cap h(w,t)))\,[\exists t'[t < t' \wedge q(w')(t') = 1]]$
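The truth conditions in this denotation can be read as a nested quantification, sketched below. This is only an illustration under assumptions: the parameter `selected_worlds` stands in for the output of $f_c(\max_{g(w)}(\cap h(w,t)))$, which is simply taken as given, and the toy worlds, the time scale and the predicate `arrives` are hypothetical.

```python
def kelh(selected_worlds, now, q, times):
    """True iff every selected world w' has some time t' later than `now` with q(w')(t') true.

    `selected_worlds` is assumed to be the set already picked out by the choice
    function from the best worlds in the modal base; it is not computed here.
    """
    return all(any(now < t and q(w)(t) for t in times) for w in selected_worlds)

# Hypothetical toy worlds: each maps a time to whether Bill arrives at that time.
TIMES = range(5)
w1 = {3: True}   # Bill arrives at time 3
w2 = {4: True}   # Bill arrives at time 4
w3 = {0: True}   # Bill arrives only at time 0
arrives = lambda w: lambda t: w.get(t, False)

future_holds = kelh([w1, w2], now=1, q=arrives, times=TIMES)
future_fails = kelh([w1, w3], now=1, q=arrives, times=TIMES)
```

The first call succeeds because each selected world contains an arrival after the evaluation time; the second fails because in `w3` the only arrival precedes it.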
c. Presuppositions of (104a):
The future claim is made on the basis of the facts; Gloria won’t
go home in at least one stereotypical world compatible with the
facts, Gloria will not go to her mother’s house in at least one
stereotypical world compatible with the facts, . . .
There are no implicatures about evidence types this time, but interestingly,
we still predict reduced interrogative force. And this time, the contribution
of the subjunctive is absolutely critical to deriving the effect. Due to the
subjunctive, the question as a whole presupposes for each contextually
salient place that Gloria might go, that there is at least one stereotypical
world compatible with the facts in which she doesn’t go there. This means
that the facts underdetermine where she might go — and thus, that the
addressee may not know where she will go. Given that the subjunctive is
crucial in deriving the reduced interrogative force, we correctly predict that
the subjunctive is obligatory in conjectural questions like (102).
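The step from the subjunctive presuppositions to the conclusion that the facts underdetermine the answer can be checked on a toy model. The sketch below is an illustration under assumptions, not the analysis itself: each stereotypical world compatible with the facts is represented only by the place Gloria goes to in it, and the place names and the helpers `presuppositions_hold` and `settled_answer` are hypothetical.

```python
PLACES = ["home", "mother's house", "store"]   # hypothetical salient destinations

def presuppositions_hold(best_worlds, places):
    """For each salient place, some best world has Gloria not going there."""
    return all(any(w != place for w in best_worlds) for place in places)

def settled_answer(best_worlds, places):
    """The place Gloria goes to in every best world, or None if the facts leave it open."""
    for place in places:
        if best_worlds and all(w == place for w in best_worlds):
            return place
    return None

# Each world is identified with Gloria's destination in that world.
open_facts = ["home", "store"]      # the facts leave her destination open
closed_facts = ["home", "home"]     # the facts settle that she goes home
```

Whenever the presuppositions hold, as with `open_facts`, no single destination is true across all the best worlds; when the facts do settle a destination, as with `closed_facts`, the presupposition for that destination fails.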
With ignorance free relatives, the modal base F is the epistemic alterna-
tives of the speaker.51 Consider (107), for example.
speaker ignorance about the denotation of the free relative actually derives
from the evidential k’a and the subjunctive.
The basic idea is that an ignorance free relative is formed from a conjec-
tural question (see Davis 2009 for this insight, although Davis does not word
it in this way). The free relative in (105a), for example, is formed from the
conjectural question in (108).
7.4 ‘Pretend’
There are two patterns to account for with the ‘pretend’ cases, depending on
the dialect. In Upper St’át’imcets, the subjunctive plus the normative modal
ka frequently renders a ‘pretend to be ...’ interpretation. In Whitley et al. no
date, a native-speaker-produced St’át’imcets teaching manual, the standard
construction when the teacher is asking the students to pretend something
is that in (109).
In Lower St’át’imcets, however, examples like the ones in (109) are rejected
in ‘pretend’ contexts. Lower St’át’imcets uses either an emphatic pronoun
in a cleft, as in (110a), or the adhortative particle malh, as in (110b). In each
case, the subjunctive is present, but ka is absent.
translates this into English as ‘You may as well be an owl’. The presence
of adhortative malh here is a matter for future research; see comments in
Section 8 below.
Support for the idea that (109) and (110) are not really ‘pretend’ construc-
tions comes from the fact that exactly parallel structures are used when the
wish is not that someone pretend to be something, but rather is a wish which
has a chance of coming true. This is shown in (111). While the consultant
accepts a ‘pretend’ translation for the sentences in (111), she spontaneously
translates them into English using simply ‘you be . . . ’. She judges that the
St’át’imcets sentences do not really mean ‘pretend’.
(112) $[\![\text{believe}]\!]^{w,g} =
\lambda p_{\langle s,t\rangle}.\,\lambda x.\,\forall w'$ compatible with what $x$ believes in $w$: $p(w') = 1$
(von Fintel & Heim 2007: 18)
There is no reason to assume that attitude verbs like ‘believe’ have a different
semantics in St’át’imcets than in English. On the contrary, the St’át’imcets
verb tsutánwas ‘think, believe’ must involve universal quantification over
belief-worlds, without the possibility of domain restriction (in other words,
there is no choice function or second ordering source). Thus, (113), just like
its English gloss, requires that in all Laura’s belief-worlds, John has left. It
cannot mean that Laura’s beliefs allow, but do not require, that John has left.
Given this, adding the subjunctive under the verb ‘believe’ in St’át’imcets
leads to the following contradictory result.
193). Here I adopt Portner’s (1997) analysis of desire verbs, and in particular
we will see that the St’át’imcets verb xát’min’ is better analyzed as similar to
English hope (which according to Portner is similar to believe, and therefore
is not intrinsically comparative) than to English want.
Portner analyzes hope in terms of a buletic accessibility relation $\mathrm{Bul}_{\alpha}(s, b)$.
For any situation s and belief situation b of an agent α, $\mathrm{Bul}_{\alpha}(s, b)$ is the set
of buletic alternatives for α in s, i.e., ‘the worlds in which the most of α’s
plans in s (relative to his or her beliefs in b) are carried out’ (Portner 1997:
178). The sentence in (116) receives the interpretation shown: it is true just in
case in all of James’s buletic alternatives, Joan arrives in Richmond soon.
Portner’s analysis of hope differs from that of want, and is parallel to that
of believe, in crucial respects (which explain the different embedding possi-
bilities for hope/believe vs. want). In particular, while hope and believe are
defined directly in terms of (doxastic or buletic) alternatives, want is defined
in terms of the agent’s plans. Portner argues that the difference between
hope and want is ‘an idiosyncratic lexical one’ (Portner 1997: 189). If this is
correct, it would not be unexpected that a language could contain only the
hope-type of desire predicate.
If we apply Portner’s analysis of hope to St’át’imcets xát’min’, and attempt
to use the subjunctive in the embedded clause, we get the result in (117).
$[\![(117)]\!]^{s}$ is only defined if $\exists s \in \mathrm{Bul}_{\text{Laura}}(s, b)$: John does not come in $s$
(117) is defined only if there is at least one situation in Laura’s buletic alter-
natives in which John does not come, but it asserts that in all Laura’s buletic
alternatives, John comes. The contradiction between the presupposition and
the assertion leads to the unacceptability of the sentence.
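The clash between presupposition and assertion can be verified mechanically on a small model. The sketch below is an illustration under assumptions: buletic alternatives are reduced to a two-world toy space, the function name `subjunctive_under_attitude` is hypothetical, and presupposition failure is represented by returning `None`.

```python
from itertools import product

def subjunctive_under_attitude(alternatives, prejacent):
    """Return None on presupposition failure, else the truth value of the assertion.

    Presupposition (subjunctive): the prejacent is false in at least one alternative.
    Assertion (hope/believe): the prejacent is true in every alternative.
    """
    if not any(not prejacent(w) for w in alternatives):
        return None   # undefined: no alternative in which the prejacent is false
    return all(prejacent(w) for w in alternatives)

john_comes = lambda w: w["john_comes"]

# Enumerate every non-empty set of alternatives over a hypothetical two-world space.
worlds = [{"john_comes": True}, {"john_comes": False}]
results = []
for mask in product([False, True], repeat=len(worlds)):
    alternatives = [w for w, keep in zip(worlds, mask) if keep]
    if alternatives:
        results.append(subjunctive_under_attitude(alternatives, john_comes))

ever_true = any(r is True for r in results)
```

No choice of alternatives makes the sentence both defined and true: either the presupposition fails, or it holds and thereby falsifies the universal assertion.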
The goal of this paper was to extend the formal cross-linguistic study of
modality to the related domain of mood. Prior work on St’át’imcets has
proposed that languages vary in whether their modals encode quantifica-
tional force (as in English), or conversational background (as in St’át’imcets)
(Matthewson et al. 2007, Rullmann et al. 2008, Davis et al. 2009). Here, I have
argued that languages vary in their mood systems along the same dimension,
at least functionally. While some languages use moods to encode distinctions
of conversational background (buletic, deontic, etc.), St’át’imcets uses mood
to functionally achieve a restriction on modal quantificational force. (Of
course technically, both modals and moods in St’át’imcets restrict conver-
sational backgrounds: the modal force is always universal.) If this view is
correct, then each language-type draws on its moods and its modals together
to allow the full range of specifications. In other words, what modals don’t
encode, moods do. The simplified typological table is repeated here.
The analysis presented here raises some questions for future research.
One outstanding issue is the status of subjunctives with no overt licenser at
54 Thanks to an anonymous reviewer for discussion of this point.
As noted above, t’u7 is present in the ‘might as well’ uses of the subjunc-
tive, and in indifference free relatives. Examples are repeated here.
(121) To get good cheese, you only have to go to the North End!
(von Fintel & Iatridou 2008: 445)
As for indifference free relatives as in (120), these also very plausibly con-
tain a covert modal, presumably a necessity one. The important question will
be whether the subjunctive can be analyzed as a weakener in the indifference
free relatives. Ideally, the future analysis of (119)–(120) will also elucidate
the semantic connection between the two t’u7-subjunctives, both of which
somehow express the notion of ‘indifference’ (although perhaps in different
senses of the word). (119b), for example, conveys that you can stay here for
the night or not, I don’t really care.
In spite of these outstanding questions, I believe that the empirical cover-
age of the analysis presented here is encouraging. Out of the nine meaningful
uses of the St’át’imcets subjunctive, we set aside two which rely on the poorly-
understood particle t’u7, but have managed to unify the remaining seven.
The analysis accounts for such seemingly disparate effects as the weakening
of imperatives, the reduction in interrogative force of questions, and the
non-appearance of the subjunctive under any attitude verb. The analysis, if
correct, supports the modal approach to mood advocated by Portner (1997),
and suggests that languages have a certain amount of freedom in how they
divide up the various functional tasks required of moods and modals.
Finally, the research reported on here opens up broader questions about
the nature of mood cross-linguistically, for example about the relation be-
tween subjunctive and irrealis. In Section 2, I showed that the St’át’imcets
subjunctive patterns morpho-syntactically, as well as in some of its semantic
properties, like a subjunctive rather than an irrealis. However, we also saw
that the St’át’imcets subjunctive differs semantically from Indo-European
subjunctives. I argued above (see fn. 9) that the use of the term ‘subjunctive’
was justified, even in the face of such non-trivial cross-linguistic variation.
However, there is much more work to be done on the formal semantics of
mood cross-linguistically. Once a wider range of systems is investigated
in depth, we may find that the traditional terminology does not correlate
with the cross-linguistically interesting divisions. Topics for future inquiry
include whether there is a minimal semantic change which would turn a
subjunctive morpheme into an irrealis one, or vice versa, and in general what
the semantic building blocks are from which moods are composed.
References
van Eijk, Jan & Lorna Williams. 1981. Lillooet legends and stories. Mt. Currie,
BC: Ts’zil Publishing House.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco
Quechua: Stanford dissertation.
Faller, Martina. 2006. Evidentiality and epistemic modality at the se-
mantics/pragmatics interface. http://www.eecs.umich.edu/~rthomaso/
lpw06/fallerpaper.pdf.
Farkas, Donka. 1992. On the semantics of subjunctive complements. In Paul
Hirschbühler & Konrad Koerner (eds.), Romance languages and modern
linguistic theory: Papers from the 20th linguistic symposium on Romance
languages, 69–104. Amsterdam and Philadelphia: Benjamins.
Farkas, Donka. 2003. Assertion, belief and mood choice. Paper presented at
the Workshop on Conditional and Unconditional Modality, ESSLLI, Vienna.
http://people.ucsc.edu/~farkas/papers/mood.pdf.
von Fintel, Kai. 2000. Whatever. In Proceedings of SALT X, 27–40. http:
//web.mit.edu/fintel/www/whatever.pdf.
von Fintel, Kai & Anthony Gillies. 2010. Must . . . stay . . . strong! Natural
Language Semantics. doi:10.1007/s11050-010-9058-2.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics lecture notes. Ms.,
MIT. http://mit.edu/fintel/IntensionalSemantics.pdf.
von Fintel, Kai & Sabine Iatridou. 2008. How to say ought in foreign: The
composition of weak necessity modals. In Jacqueline Guéron & Jacqueline
Lecarme (eds.), Time and modality, 115–141. Dordrecht: Springer. http:
//mit.edu/fintel/fintel-iatridou-2006-ought.pdf.
Garrett, Edward. 2001. Evidentiality and assertion in Tibetan. Los Angeles,
CA: UCLA dissertation.
Gauker, Christopher. 1998. What is a context of utterance? Philosophical
Studies 91(2). 149–172. doi:10.1023/A:1004247202476.
Giannakidou, Anastasia. 1997. The landscape of polarity items. Groningen:
University of Groningen dissertation.
Giannakidou, Anastasia. 1998. Polarity sensitivity as (non)veridical depen-
dency. Amsterdam and Philadelphia: John Benjamins.
Giannakidou, Anastasia. 2009. The dependency of the subjunctive re-
visited: Temporal semantics and polarity. Lingua 119(12). 1883–1908.
doi:10.1016/j.lingua.2008.11.007.
Giorgi, Alessandra & Fabio Pianesi. 1997. Tense and aspect: From semantics
to morpho-syntax. Oxford: Oxford University Press.
Guerzoni, Elena. 2003. Why ‘even’ ask? on the pragmatics of questions and
Mitchell, Keith. 2003. Had better and might as well: On the margins of modality?
In M. Krug, R. Facchinetti & F. Palmer (eds.), Modality in contemporary
English, 129–149. Berlin: Mouton de Gruyter.
Murray, Sarah. to appear. Evidentiality and questions in Cheyenne. In Suzi
Lima (ed.), Proceedings of SULA 5: Semantics of under-represented lan-
guages in the Americas, Amherst, MA: GLSA Publications.
Palmer, Frank. 2006. Mood and modality. Cambridge: Cambridge University
Press 2nd edn. doi:10.2277/0521804795.
Panzeri, Francesca. 2003. In the (indicative or subjunctive) mood. In Pro-
ceedings of Sinn und Bedeutung 7, http://ling.uni-konstanz.de/pages/
conferences/sub7/proceedings/download/sub7_panzeri.pdf.
Peterson, Tyler. 2009. The ordering source and graded modality in Gitksan
epistemic modals. Ms., University of British Columbia. http://www.
linguistics.ubc.ca/sites/default/files/Peterson(SuB).pdf.
Peterson, Tyler. 2010. Epistemic modality and evidentiality in Gitksan at the
semantics-pragmatics interface: University of British Columbia disserta-
tion. http://hdl.handle.net/2429/23596.
Portner, Paul. 1997. The semantics of mood, complementation and
conversational force. Natural Language Semantics 5(2). 167–212.
doi:10.1023/A:1008280630142.
Portner, Paul. 2003. The semantics of mood. In Lisa Cheng & Rint Sybesma
(eds.), The second Glot international state-of-the-article book, 47–77. Berlin:
Mouton de Gruyter.
Portner, Paul. 2004. The semantics of imperatives within a theory of clause
types. In Proceedings of SALT XIV, Cornell University: CLC Publications.
http://semanticsarchive.net/Archive/mJlZGQ4N/PortnerSALT04.pdf.
Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics
15(4). 351–383. doi:10.1007/s11050-007-9022-y.
Portner, Paul. 2009. Modality Oxford Surverys in Semantics and Pragmatics.
Oxford: Oxford University Press.
Potts, Christopher. 2005. The logic of conventional implicatures. Oxford:
Oxford University Press.
Quer, Josep. 1998. Mood at the interface. The Hague: Holland Academic
Graphics.
Quer, Josep. 2001. Interpreting mood. Probus 13(1). 81–111.
doi:10.1515/prbs.13.1.81.
Quer, Josep. 2009. Twists of mood: The distribution and interpre-
tation of indicative and subjunctive. Lingua 119(12). 1779–1787.
9:72
Cross-linguistic variation in modality systems: The role of mood
doi:10.1016/j.lingua.2008.12.003.
Rivero, María. 1975. Referential properties of Spanish noun phrases. Language
51(1). 32–48. doi:10.2307/413149.
Rocci, Andrea. 2007. Epistemic modality and questions in dialogue. the
case of Italian interrogative constructions in the subjunctive mood. In
L. de Saussure, J. Moeschler & G. Puska (eds.), Tense, mood and aspect:
Theoretical and descriptive issues, 129–153. Amsterdam and New York:
Rodopi.
Rullmann, Hotze, Lisa Matthewson & Henry Davis. 2008. Modals as
distributive indefinites. Natural Language Semantics 16(4). 317–357.
doi:10.1007/s11050-008-9036-0.
Schwager, Magdalena. 2005. Interpreting imperatives: University of Frank-
furt/Main dissertation.
Schwager, Magdalena. 2006. Conditionalized imperatives. In Proceedings of
SALT XVI, Cornell University: CLC Publications. http://ecommons.library.
cornell.edu/bitstream/1813/7591/1/salt16_schwager_241_258.pdf.
Schwager, Magdalena. 2008. Optimizing the future - imperatives between
form and function. Course notes, ESLLI 2008. http://zis.uni-goettingen.
de/mschwager/esslli08/ms_schwager_esslli08.pdf.
Stalnaker, Robert. 1974. Pragmatic presuppositions. In Milton Munitz & Peter
Unger (eds.), Semantics and Philosophy, 197–214. New York University
Press.
Stalnaker, Robert. 1984. Inquiry. Cambridge, MA: MIT Press.
Tenny, Carol. 2006. Evidentiality, experiencers and the syntax of sen-
tience in Japanese. Journal of East Asian Linguistics 15(3). 245–288.
doi:10.1007/s10831-006-0002-x.
Tenny, Carol & Peggy Speas. 2004. The interaction of clausal syntax, discourse
roles and information structure in questions. Paper presented at the Work-
shop on Syntax, Semantics and Pragmatics of Questions. ESLLI, Université
Henri Poincaré, Nancy. http://www.linguist.org/ESSLI-Questions-hd.pdf.
Terrell, Tracy & Joan Hooper. 1974. A semantically based analysis of mood in
Spanish. Hispania 57(3). 484–494. doi:10.2307/339187.
Thoma, Sonja. 2007. The categorical status of independent pronouns in
St’át’imcets. Ms., University of British Columbia.
Villalta, Elisabeth. 2009. Mood and gradability: an investigation of the
subjunctive mood in Spanish. Linguistics and Philosophy 31(4). 467–522.
doi:10.1007/s10988-008-9046-x.
Whitley, Rose (translator), Henry Davis, Lisa Matthewson & Beveley Frank
9:73
Lisa Matthewson
Lisa Matthewson
UBC Department of Linguistics
Totem Field Studios
2613 West Mall
Vancouver, BC, V6T 1Z4, Canada
lisamatt@interchange.ubc.ca
Semantics & Pragmatics Volume 3, Article 10: 1–38, 2010
doi: 10.3765/sp.3.10
Chris Barker
New York University
∗ Thanks to Simon Charlow, Emmanuel Chemla, Cleo Condoravdi, Judith Degen, Nicholas
Fleisher, Sven Lauer, Koji Mineshima, Paul Portner, Daniel Rothschild, Philippe Schlenker,
Chung-chieh Shan, Seth Yalcin, and my anonymous referees.
Since Ross 1941, it has been clear that the logic of obligation and permission
behaves dramatically differently than other sorts of ordinary reasoning:
If (1a) is true, then it is certainly true that you may eat an apple. Likewise, it
is equally true that you have it within your power to safely eat a pear. So an
adequate account of the meaning of (1a) must explain how it comes to imply
(1b) and (1c).
This pattern is by no means the usual case. Consider a variation on (1) in
which the permissive modal may is omitted:
In this case, (2a) certainly does not imply either (2b) or (2c). So something
about permission talk correlates with the unusual implications we are con-
cerned with here.
The puzzle posed by the facts in (1) is known as the free choice permission
problem (Kamp (1973) attributes the choice of name to von Wright).
Since (1a) implies both (1b) and (1c), (1b) and (1c) are therefore both equally
true. Thus in many discussions, (1a) is said to imply (3a), since (3a) is merely
the conjunction of (1b) and (1c):
(3) a. You may eat an apple and you may (also) eat a pear.
b. You may eat an apple or you may (*also) eat a pear.
Free choice permission as resource-sensitive reasoning
fruit. This is why also is never appropriate in the second disjunct in (3b) on
the intended reading.
What I am suggesting is that a complete characterization of permission
sentences must not only tell us whether permission exists and what type of
permission it is (i.e., permission to eat an apple versus permission to eat a
pear), it must also characterize how much permission has been granted. Thus
it must predict that (1a) and (3b) guarantee permission only to eat one piece
of fruit, but that (3a) can be used to provide permission to eat two pieces of
fruit.
The key insight that I would like to develop in this paper first appears,
as far as I know, in unpublished work of Lokhorst (1997): that permission
and obligation form a resource-sensitive domain, so that logics based on
(resource-insensitive) classical logic are not appropriate. Lokhorst suggests
using Girard’s (1987) Linear Logic instead, and I will follow the technical
details of his proposal closely. The contribution of this paper will be to
introduce Lokhorst’s work to a linguistic audience, to evaluate it with respect
to competing linguistic analyses, and to investigate the implications of adapt-
ing Lokhorst’s proposal for the theory of natural language semantics and
pragmatics.
Resource-sensitive (‘substructural’) logics are already familiar in linguis-
tics as tools for building syntax/semantics interfaces (e.g., Moortgat 1997
or Dalrymple 2001). As far as I know, however, no one has yet suggested
that natural language connectives such as or or and can have uses in which
they behave semantically like connectives in a substructural logic, as I am
suggesting here.
Kamp (1973, 1978) discusses free choice permission not just as a puzzle
for modeling reasoning about obligation (deontic logic), but as a puzzle
for the composition of natural language expressions. From the point of
view of natural language semantics, the interesting thing about the free
choice permission problem is that it appears to require not only making
assumptions about the meaning of certain uses of modal expressions such as
may, but about the meaning of the corresponding uses of the coordinating
conjunctions and and or. This will be true of the solution I offer below.
Many solutions to the free choice permission problem rely on pragmatic
mechanisms for much of the heavy lifting, including Kamp 1978, Zimmer-
mann 2000, Fox 2007, and others. The arguments that free choice implica-
tions are pragmatic, and more specifically are scalar implicatures, stem from
discussions of indefinites in Kratzer and Shimoyama 2002, as developed by
Alonso-Ovalle (2006) and Fox (2007). The main evidence that free choice
implications may be scalar implicatures turns on the behavior of negated
permission sentences (You may not eat an apple or a pear); I show how the
analysis here can explain the behavior of such sentences in section 5.
In contrast to the pragmatic approaches, I will argue that the main free
choice implications, including especially the implications from (1a) to (1b)
and to (1c), are matters of entailment. To the extent that the analysis here
is viable, it calls into question whether free choice implications are indeed
implicatures. I discuss other entailment approaches (e.g., Aloni 2007) in
section 6.2.
The account of free choice given below will depend on understanding the
basics of Linear Logic at a fairly deep level. Since Linear Logic is unfamiliar
to most semanticists, this section will present the basics of Linear Logic.
I will only introduce the elements of classical logic that will be relevant for
comparison with Linear Logic in the discussion below. This will include
conjunction, disjunction, negation, and Weakening, but not, for example,
quantification.
Formulas. There is a set of atomic formulas a, b, c, . . . , and a set of
variables over formulas A, B, C, . . . . Assume A and B are formulas. Then the
classical negation of A, written ¬A, is a formula; the classical conjunction of
A and B, written A ∧ B, is a formula; and the classical disjunction of A and B,
written A ∨ B, is a formula. In addition, the classical implication of A and B,
written A → B, is defined as an abbreviation of (¬A) ∨ B.
Sequents. A sequent A, B, . . . , M ⊢ N, O, . . . , Z consists of two multisets
of formulas joined by a turnstile (‘⊢’). Classical sequents are interpreted as
asserting that whenever all of the formulas in the leftmost multiset hold,
then at least one of the formulas in the rightmost multiset must also hold.
Saying that a sequent contains multisets rather than lists of formulas means
that the order in which formulas are written is immaterial. Thus A, B and
B, A represent the same multiset, but A, B is a different multiset than A, A, B,
since the second multiset contains two instances of the formula A.
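The multiset convention can be illustrated with a minimal Python sketch. The helper name `multiset` is introduced here purely for illustration; the only assumption is that formulas are represented as strings.

```python
from collections import Counter

def multiset(*formulas):
    """Represent one side of a sequent as a multiset of formula names."""
    return Counter(formulas)

# Order is immaterial: A, B and B, A are the same multiset.
same_up_to_order = multiset("A", "B") == multiset("B", "A")

# Multiplicity matters: A, B differs from A, A, B.
same_up_to_copies = multiset("A", "B") == multiset("A", "A", "B")

print(same_up_to_order)   # True
print(same_up_to_copies)  # False
```

A set would collapse A, A, B into A, B, and a list would distinguish A, B from B, A; the multiset sits between the two.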
(4) a. ¬¬A ≡ A
b. ¬(A ∧ B) ≡ ¬A ∨ ¬B
c. ¬(A ∨ B) ≡ ¬A ∧ ¬B
The last two (DeMorgan’s laws) express the logical interrelationship between
disjunction and conjunction. These equivalences can be thought of as bi-
directional inference rules. In any case, I will freely replace formulas with
forms deemed equivalent by (4).
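As a sanity check, the equivalences in (4) can be verified by brute force over truth values. The following is only a sketch: the function name `equivalent` is hypothetical, and formulas are encoded as Python functions over booleans.

```python
from itertools import product

def equivalent(f, g, arity):
    """True iff f and g agree on every assignment of truth values."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=arity))

# (4a) double negation
double_negation = equivalent(lambda a: not (not a), lambda a: a, 1)

# (4b) negated conjunction equals disjunction of negations
de_morgan_and = equivalent(
    lambda a, b: not (a and b),
    lambda a, b: (not a) or (not b),
    2,
)

# (4c) negated disjunction equals conjunction of negations
de_morgan_or = equivalent(
    lambda a, b: not (a or b),
    lambda a, b: (not a) and (not b),
    2,
)

print(double_negation, de_morgan_and, de_morgan_or)  # True True True
```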
Weakening. Weakening allows assumptions to be discarded.
\[
\frac{\Delta \vdash \Gamma}{\Delta, A \vdash \Gamma}\ \mathrm{Weak}
\]
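\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{A \vdash A}{A, \neg B \vdash A}\,\mathrm{Weak}
      \qquad
      \dfrac{\neg B \vdash \neg B}{\neg B, A \vdash \neg B}\,\mathrm{Weak}
    }{\neg B, A \vdash A \wedge \neg B}\,\wedge
  }{A, \neg(A \wedge \neg B) \vdash \neg\neg B}\,\neg_1, \neg_2
}{A, A \to B \vdash B}\,\equiv
\]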
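\[
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash A^{\perp}, \Gamma}\ \perp_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, A^{\perp} \vdash \Gamma}\ \perp_2
\qquad
\frac{}{A \vdash A}\ \mathrm{Axiom}
\]
\[
A \multimap B \equiv A^{\perp} \parr B
\]
\[
A^{\perp\perp} \equiv A
\qquad
(A \mathbin{\&} B)^{\perp} \equiv A^{\perp} \oplus B^{\perp}
\qquad
(A \otimes B)^{\perp} \equiv A^{\perp} \parr B^{\perp}
\]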
Linear conjunction and disjunction. The rules for & and ⊕ (the ‘additive’
connectives) look exactly like the classical rules for ∧ and ∨, except for the
substitution of & for ∧ and of ⊕ for ∨. However, as a result of how they
interact with the rest of the logic, the linear logic additives behave differently
from their classical counterparts. For instance, the law of the excluded
middle is valid for classical disjunction: ⊢ (¬A) ∨ A. In Linear Logic, the law
of excluded middle is not valid for additive disjunction, despite the fact that
the inference rule for additive disjunction has the same form as the inference
rule for classical disjunction: ⊬ A⊥ ⊕ A. However, the excluded middle is
valid for multiplicative disjunction (⊢ A⊥ ⅋ A).
Linear negation. We have direct analogs to the classical rules for pushing
a formula across the turnstile, namely, ⊥1 and ⊥2 . Since we now have two
kinds of conjunctions and two kinds of disjunctions, there are more duality
equivalences; however, each conjunction is still dual to a disjunction, and
vice-versa.
Linear implication. Once again, we have defined implication in terms of
disjunction. Now, interestingly, we can prove the linear version of Modus
Ponens without using Weakening (which is a good thing, since Weakening is
not allowed in Linear Logic):
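\[
\dfrac{
  \dfrac{
    \dfrac{A \vdash A \qquad B^{\perp} \vdash B^{\perp}}
          {A, B^{\perp} \vdash A \otimes B^{\perp}}\,\otimes
  }{A, (A \otimes B^{\perp})^{\perp} \vdash B^{\perp\perp}}\,\perp_1, \perp_2
}{A, A \multimap B \vdash B}\,\equiv
\]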
Because the inference rule for ⊗ splits up the resources (that is, the formulas)
into those used to prove A and those used to prove B, there is no need to
ignore gratuitous assumptions via Weakening.
If we try to reproduce Wadler’s classical proof from the previous section,
we’re out of luck:
\[
\dfrac{?? \vdash A \qquad ?? \vdash B}{A, A \multimap B \vdash A \otimes B}\,\otimes
\]
We could take some of the resources to the left of the turnstile to prove A,
and we could take some (actually, we would need all) of the resources to
prove B, but no matter how we divide up the left-hand formulas, we’ll fall
short of proving one or the other of the conjuncts. Linear Logic requires
strict accounting of assumptions, and we can’t make use of A twice, the way
we could in the classical proof.
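The bookkeeping behind this failure can be made concrete with a small search procedure. The sketch below is not the sequent calculus of this section: it is a hypothetical checker for a tiny fragment (atomic formulas, implications between atoms on the left, ⊗ on the right), and it uses a modus-ponens-style left rule for ⊸ rather than the definition in terms of ⅋. All function names are assumptions made for illustration.

```python
from collections import Counter

def imp(p, q):
    """The linear implication p -o q, with p and q atomic."""
    return ("imp", p, q)

def tensor(g, h):
    """The multiplicative conjunction of two goals."""
    return ("tensor", g, h)

def splits(resources):
    """Enumerate every way of dividing a multiset into two parts."""
    items = list(resources.elements())
    for mask in range(2 ** len(items)):
        left = Counter(x for i, x in enumerate(items) if (mask >> i) & 1)
        yield left, resources - left

def proves(resources, goal):
    """True iff the goal follows using every resource exactly once."""
    if isinstance(goal, tuple) and goal[0] == "tensor":
        # The tensor rule: split the resources between the two conjuncts.
        return any(
            proves(d1, goal[1]) and proves(d2, goal[2])
            for d1, d2 in splits(resources)
        )
    if resources == Counter([goal]):
        return True  # Axiom: exactly one copy of the goal, nothing left over.
    for f in list(resources):
        if isinstance(f, tuple) and f[0] == "imp":
            rest = resources - Counter([f])
            # Spend some resources proving the antecedent; the consequent
            # then joins the remaining resources.
            for d1, d2 in splits(rest):
                if proves(d1, f[1]) and proves(d2 + Counter([f[2]]), goal):
                    return True
    return False

modus_ponens = proves(Counter(["A", imp("A", "B")]), "B")
reuse_of_a = proves(Counter(["A", imp("A", "B")]), tensor("A", "B"))
two_copies = proves(Counter(["A", "A", imp("A", "B")]), tensor("A", "B"))

print(modus_ponens)  # True
print(reuse_of_a)    # False
print(two_copies)    # True
```

With a single copy of A, every split leaves one conjunct unprovable; supplying a second copy of A makes the same goal provable, which is the sense in which assumptions are counted.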
2.3 Choice
Since free choice permission is about making choices, what does Linear Logic
have to say about choice?
The critical connectives will be the additive conjunction ‘&’ and its (also
additive) disjunctive dual, ‘⊕’. The relevant inference rules are repeated here:
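\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\ \&
\qquad
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\ \oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\ \oplus_2
\]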
Imagine yourself in the role of the prover. Then the assumptions on the left
of the turnstile are what your environment gives you to work with, and the
conclusion on the right of the turnstile is what you return as the result of
your labors (perhaps to be used as an assumption in a larger proof).
So here is what the & inference says: if the resources in ∆ allow you to
provide A, and if the same resources allow you to provide B, then you can
certainly offer to provide either A or B. Furthermore, since you are prepared
to provide either alternative, you can leave the choice up to whoever might
be interested in making use of the conclusion. Thus & conjoins two equally
viable alternatives.
Though both alternatives are equally viable, the consumer is forced to
choose between them. For instance, imagine that ∆ contains a certain amount
of sugar and a certain number of eggs. Using the resources provided, you
can construct either a meringue or else an angel food cake, but you don’t
have enough ingredients to cook both. Being as flexible and gracious as
possible, you offer “meringue & cake” for dessert, and you let your guest
choose. Tellingly, “meringue & cake” is pronounced “meringue or cake” in
idiomatic English (this is a point that we will return to in section 7.3).
In the context of granting permission, the consumer is the entity to which
permission has been granted: we shall see that (unembedded) & corresponds
to free choice on the part of the entity given permission.
Continuing our investigation of choice in Linear Logic, consider the ⊕1
inference rule: if the resources in ∆ allow you to provide A, then you
can certainly offer to provide A ⊕ B, as long as you remain in control
of which of the alternatives is chosen. You may only know how to make
one dessert, perhaps. You can truthfully promise that dessert will either be
meringue or else Baked Alaska, although you know in advance that it will
have to be meringue. (Analogously with the roles reversed for ⊕2 .)
In the context of granting permission, offering A ⊕ B does not give the
grantee free choice.
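The contrast between the two connectives can be pictured in programming terms. The following Python sketch is only an analogy under stated assumptions: the class names `With` and `Plus` are hypothetical, and the single-use check stands in for the fact that the resources in ∆ suffice for one dessert only.

```python
class With:
    """A & B: both alternatives are on offer, and the consumer chooses one."""

    def __init__(self, make_left, make_right):
        self._makers = {"left": make_left, "right": make_right}
        self._spent = False

    def choose(self, side):
        # The shared resources can be used for only one alternative.
        if self._spent:
            raise RuntimeError("resources already spent")
        self._spent = True
        return self._makers[side]()


class Plus:
    """A (+) B: the provider has already decided which alternative it is."""

    def __init__(self, side, value):
        self.side = side
        self.value = value


# The host can make either dessert, so the guest decides.
dessert_offer = With(lambda: "meringue", lambda: "cake")
guest_gets = dessert_offer.choose("right")

# The host knows only one recipe, so the guest has no say.
dessert_promise = Plus("left", "meringue")

print(guest_gets)             # cake
print(dessert_promise.value)  # meringue
```

Calling `choose` a second time on the same offer raises an error, mirroring the point that the guest cannot have both the meringue and the cake.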
linear logic, in particular, the absence of Weakening, we can’t ignore the dead
postman. As a result, the combination of eating an apple and killing the
postman will land us in a situation that is far from ok: A, K, A ⊸ δ ⊬ δ.
A fuller understanding of linear implication, and therefore of strong
permission, will emerge from the model theory developed in section 8.
One major expository advantage of the reduction strategy is that it enables
us to talk about permission without complicating the logic with inference
rules for □ and ♦. Note that we do not necessarily give up anything by omitting
the unary connectives: McNamara (2006) and Lokhorst (2006) show that
under appropriate additional assumptions, deontic reduction characterizes
all the theorems of standard deontic modal logics.
Not that replicating standard deontic logic should be our goal; after all,
standard deontic logic has □A → ¬□¬A as a tautology, which imposes a kind
of consistency on the set of deontic obligations. In the linguistics tradition,
a number of people (notably Kratzer (1991)) have argued that this is not
appropriate for describing natural language modality, and that we should
instead allow for inconsistent laws. However, I’m not aware of any reason
why deontic reduction is incompatible with Kratzer’s characterization of
deontic modality.
I should note that deontic reduction is not an innocent choice for the
empirical phenomena under consideration here. As I will explain shortly,
because linear implication is defined as A ⊸ B ≡ A⊥ ⅋ B, the formula for
which permission is granted (i.e., A) occurs in a downward-entailing position.
This will be crucial in deriving the desired entailments. For all I know,
however, it is possible that if a suitable notion of strong permission were
defined in a standard deontic framework (i.e., one based on unary operators
like □), similar entailments would go through.
I intend for deontic reduction to be a convenient expository choice, and
not an essential feature of a resource-sensitive approach to free choice
permission. Nevertheless, there may be some empirical support for the
naturalness of deontic reduction. After all, in addition to being able to use
a modal verb to express permission and obligation, English can also deploy
a conditional: It’s ok if you eat ‘You may eat’. In fact, in Japanese there is
no modal verb that expresses permission, and permission normally can only
be conveyed by means of a conditional construction (Clancy 1985, Akatsuka
1992): tabe-temo ii ‘eat-even.if good’, ‘It’s ok if you eat’.
We can now suppose that or has among its meanings ⊕, so that You may
eat an apple or⊕ a pear translates as (a ⊕ p) ⊸ δ: the additive disjunction
of a and p is explicitly permitted. Then the desired free-choice implication
follows directly from simple linear reasoning. Generalizing slightly by using
variables over formulas (A, B) instead of atomic formulas (a, p), we have:
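\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{\vdash A, A^{\perp}}{\vdash A \oplus B, A^{\perp}}\,\oplus_1
        \quad
        \vdash \delta^{\perp}, \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp}, A^{\perp}, \delta}\,\otimes
    }{\vdash (A \oplus B) \otimes \delta^{\perp}, A^{\perp} \parr \delta}\,\parr
    \qquad
    \dfrac{
      \dfrac{
        \dfrac{\vdash B, B^{\perp}}{\vdash A \oplus B, B^{\perp}}\,\oplus_2
        \quad
        \vdash \delta^{\perp}, \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp}, B^{\perp}, \delta}\,\otimes
    }{\vdash (A \oplus B) \otimes \delta^{\perp}, B^{\perp} \parr \delta}\,\parr
  }{\vdash (A \oplus B) \otimes \delta^{\perp}, (A^{\perp} \parr \delta) \mathbin{\&} (B^{\perp} \parr \delta)}\,\&
}{(A \oplus B) \multimap \delta \vdash (A \multimap \delta) \mathbin{\&} (B \multimap \delta)}\,\perp_2, \equiv
\]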
5 Prohibition
The main fact to be explained is that (5a) implies (perhaps entails) (5b) and
(5c). Unlike positive free choice implications, we can usually infer that (5b)
and (5c) hold simultaneously. That is, you cannot comply with (5a) by merely
refraining from eating apples. Apparently, permission is a scarce resource,
but prohibition is all too abundant. I will call this construal of (5a) the double-
prohibition reading, and I will suggest that it arises as a standard Gricean
implicature.
As with most stories about scalar implicatures, we will be concerned with
the epistemic state of the discourse participants.
The translation of (6a) entails the translation of (6b) (that is, (6c) is a theorem),
so we predict that (6a) ought to have an interpretation on which it guarantees
that (6b) is true. Such an interpretation is widely attested in the literature,
and usually is described as favoring the continuation . . . but I don’t know
(7) You may not eat this apple or this pear . . . but I won’t tell you which.
Once again, the rational course of action on the part of the younger sibling
will be to refrain from eating either piece of fruit. Presumably this is exactly
the outcome the unkind sister is aiming for. (I’m indebted to Sven Lauer for
this scenario; see also Simons 2005:273n.4.)
In both the ignorance scenario and the uncooperative scenario, at least
one of the disjuncts holds, but the choice of which fruit is prohibited belongs
to the master, not the slave. The subject of the prohibition must plan for the
worst, and therefore can’t safely commit to either alternative.
Finally, imagine that the speaker is neither ignorant nor uncooperative.
She may be an expert (perhaps she just received full instructions from the
parents) or she may be herself the source from which permission flows; in
any case, she is fully opinionated about what is forbidden. Crucially, although
(6) guarantees only one disjunct, it is consistent with situations in which
both disjuncts hold. As just argued, if exactly one disjunct held, the speaker
would simply have said so. We can deduce, therefore, that both disjuncts
must hold.
There is one more step to complete the Gricean explanation. If the speaker
intends to convey double prohibition, why not use and?
A number of authors, including Schulz (2005) and Fox (2007), suggest that
free choice implications are implicatures that arise in contexts in which the
speaker is opinionated about which options are permitted and which are not.
Fox (2007) reasons as follows: if a speaker utters a disjunction when she
could have made a stronger statement, this could naturally lead to a Quantity
implicature that she did not have sufficient evidence to assert the stronger
statement. If those ignorance implicatures are implausible, as when the
speaker is describing permissions in a situation in which their judgment is
authoritative, the implausibility can trigger a repair strategy under which the
disjunction is pragmatically enriched by the application of a predicate exh
(for “exhaustive”). For instance, if an authoritative speaker says You may eat
an apple or a pear, it may be implausible that she doesn’t know whether you
may eat an apple, or whether you may eat a pear. Therefore the statement
♦(A ∨ P ) can be strengthened (given a number of additional assumptions) to
an exhaustive meaning equivalent to the proposition ♦A ∧ ♦P ∧ ¬(♦(A ∧ P )).
This asserts that you may have an apple, and you may have a pear, but you
may not both have an apple and a pear.
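The difference between the plain and the exhaustified meaning can be checked by enumerating small models. The sketch below rests on assumptions that are not part of Fox's proposal: worlds are reduced to two facts (whether an apple is eaten, whether a pear is eaten), a permission state is any nonempty set of permitted worlds, and ♦ is existential quantification over that set. All names are introduced for illustration.

```python
from itertools import combinations, product

# A world records (apple eaten, pear eaten).
WORLDS = list(product([False, True], repeat=2))

def permission_states():
    """Every nonempty set of permitted worlds."""
    for size in range(1, len(WORLDS) + 1):
        for subset in combinations(WORLDS, size):
            yield frozenset(subset)

def may(state, prop):
    """Possibility: some permitted world satisfies the proposition."""
    return any(prop(w) for w in state)

def apple(w):
    return w[0]

def pear(w):
    return w[1]

def plain(state):
    """The unenriched meaning: may (apple or pear)."""
    return may(state, lambda w: apple(w) or pear(w))

def exhaustive(state):
    """May apple, and may pear, and not may (apple and pear)."""
    return (
        may(state, apple)
        and may(state, pear)
        and not may(state, lambda w: apple(w) and pear(w))
    )

plain_states = [s for s in permission_states() if plain(s)]
exhaustive_states = [s for s in permission_states() if exhaustive(s)]

print(len(plain_states))       # 14
print(len(exhaustive_states))  # 2
```

Of the fifteen candidate permission states, fourteen verify the plain disjunctive permission, but only two verify the exhaustive meaning: those that permit an apple-only world and a pear-only world and exclude the world in which both are eaten.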
I will discuss three potential problems with these accounts. The first
problem is that the free-choice reading can survive even in the presence of
manifest ignorance on the part of the speaker:
(10) If it turns out that John may have an apple or a pear, he’ll choose the
pear.
(11) You may eat an apple or a pear, although in fact you may not eat an
apple.
eating a pear is permitted, but eating an apple and a pear is forbidden. But
as Simons (2005) and others observe, free choice is compatible with joint
permission. For instance,
be true, then, just in case You may eat an apple is true and You may eat a
pear is true.
The account here resembles Aloni’s alternatives account in two important
respects. First, free choice implications are entailments rather than implica-
tures. As we saw in section 6.1, the fact that free choice implications do not
always seem to be cancelable argues in favor of theories on which they are
treated as entailments.
Second, because alternative-taking may requires ordinary may to be true
of every alternative, it is a downward-entailing operator with respect
to the disjunction that gives rise to the alternatives. Aloni points out
that this explains why (so-called free choice) any is licensed (e.g., You may
eat anything), and since the antecedent of linear implication is likewise a
downward-entailing position (as noted above), the same explanation carries
over here. (Of course, there is more to free choice than placing an indefinite
in a downward entailing context. For instance, a referee observes that in
some Romance languages, some free-choice indefinites are licensed under
permission, but not in the antecedent of conditionals or in other downward
entailing contexts.)
One important difference between the approach here and alternative-based
analyses, including Aloni’s, is the integration with the larger compositional
system. The alternative-set approach in effect creates unbounded
dependencies in the semantics: or introduces alternatives which the
compositional system must track until an alternative-aware operator collapses
the alternatives back into a single proposition. The account here adjusts only
the denotations of the logical connectives, leaving the compositional system
entirely undisturbed. (Not that I have provided a compositional analysis,
though I trust that appropriate details can easily be supplied.)
7 Issues
In parallel with the permission cases, the disjunction in (13a) entails (13b) and
(13c).
The simplest way to extend the account here to epistemic cases would
be to add to our logic a new atomic formula ε, which is true just in case
everything that is epistemically known holds. Then You might be in Aarhaus
would translate as A ⊸ ε, and the desired entailments follow as a matter of
logic.
Adding an epsilon to the logic is more than a superficial change. It is im-
portant to keep track of what the logic claims to be modeling. Classical logic
promises to preserve truth: if the assumptions are true, the conclusion will
be true. Since truth is not resource sensitive (if something is true once, it is
true again and again), it is legitimate to duplicate and discard assumptions.
Linear Logic promises to preserve resources: whatever resources
the assumptions provide, that is exactly what resources will appear in the
conclusion. In our deontic application, the critical resource is permission:
if the assumptions provide enough permission to eat exactly one piece of
fruit, then the conclusion will provide the same amount of permission. In
the epistemic case, the critical resource is epistemic commitment: whatever
commitments are made by the assumptions, the conclusion will make exactly
the same commitments.
There are other important differences between deontic logic and epistemic
logic. For instance, it is generally considered desirable for an epistemic logic
to guarantee that if you know that A is true, then A is true (□A ⊢ A). But
deontically, you would not want to conclude from the fact that A is obligatory
that A must hold, since obligations are all too often not fulfilled. More
relevantly, there are empirical disanalogies between the free choice behavior
of deontic uses of modals versus epistemic modals. For instance, Kamp
(1978), Zimmermann (2000), and Aloni (2007) note that it is significantly more
difficult to construe epistemic modals as having a . . . but I don’t know which
interpretation (though it is still possible — see especially Simons 2005:274).
I’m not aware of any reason why a reduction strategy could not be part
of a more complete analysis of epistemic modality; nevertheless, it would
be prudent to be cautious about assuming that any deontic analysis should
automatically extend to epistemic cases.
In addition to the possibility that free choice effects may occur in other
modalities, Fox (2007) argues that free choice effects can be discerned in
non-modal contexts that involve existential quantifiers.
Especially when (14) is heard as an implicit permissive, (14) entails both that
there is beer in the fridge and that there is beer in the cooler out back. Both
alternatives are guaranteed to be true, and the consumer of the information
has free choice of which one is relevant for forming a plan of action.
Klinedinst (2007) suggests that free choice effects are present with some
existential quantifiers, but only when the quantificational DP is plural:
In (15a), there is a reading on which some passengers got sick, and some had
difficulty breathing. On such a reading, at least some of the passengers must
have gotten sick, and at least some of the passengers must have had difficulty
breathing. But in (15b), there is no guarantee that both of the properties must
be instantiated.
Having mentioned these facts, I will not attempt a discussion here of the
interaction of free choice with quantifiers or with plurals. See Chemla 2009a
for experimental evidence and relevant discussion.
7.2 Performativity
(16) You may pillage city X or city Y. But first take counsel with my secre-
tary.
Kamp (1973:67; see also Kamp 1978:279) says of this example that “[t]he
second part of this statement makes it clear that the vassal should not infer
from the first part that he may make his own choice of city. Which one he may
loot ultimately depends on the secretary’s advice, the tenor of which — we
may assume — is at this point unknown to king and vassal alike.” To be
sure, nothing specific has been permitted, and the vassal cannot form a
complete plan of action. If we conceive of a performative as something that
enlarges what an agent may safely do, we might therefore suppose that (16) is
a merely descriptive use, since it does not by itself allow the vassal to act. Yet
something must have been permitted: where does the disjunctive permission
that the sentence describes come from, if not from the performance of (16)?
As far as the current paper is concerned, it is enough for permission
sentences to characterize what is allowed. Then whether an utterance ex-
pands the sphere of permissibility depends on the interaction of the truth
conditions with the normal range of factors that influence how a discourse
participant decides to react to an utterance. Whether this minimalist strategy
is viable, or whether it will ultimately be necessary to provide a special role
for performativity remains to be seen. (See Kamp 1978 for extensive, but
ultimately inconclusive, discussion.)
The account of free choice given so far does not explain why (17b) also has a
free choice interpretation.
Simons proposes an across-the-board LF movement operation on which
the sentence with unembedded or is predicted to be logically equivalent to
You may [eat an apple or eat a pear]. That approach is compatible with the
account of free choice here.
The discussion so far has been conducted entirely in terms of inference rules
and proofs. It is unusual these days, though not unheard of, to express
the meaning of natural language using proof theory without giving a model
theory. More often, of course, we have the opposite situation, in which
semantic analyses provide models without any proof theory.
The most complete picture, however, emerges when proof theory and
model theory complement each other. Therefore I will discuss models for
Linear Logic here, with a detailed illustration of a free choice example.
There are a number of semantic approaches to Linear Logic. Girard’s
(1987, 1995) original semantics in terms of coherence spaces and in terms of
phase spaces would not be directly helpful here. There are other semantic
approaches, however, that have tantalizing associations with the granting
and denying of permission. I will mention three. First, Petri nets describe
the movement of tokens through a network. Lokhorst (1997) uses Petri nets
as models of his Linear Logic treatment of deontic reasoning. (Think of the
tokens as lumps of permission moving from one location to another.) Second,
in game semantics a Proponent and an Opponent take turns making choices,
and I have argued that tracking choice is central to understanding permission
talk. See, e.g., Accorsi and van Benthem 1999 for a discussion of game
semantics for Linear Logic. Third, there are computational models of Linear
Logic that make an explicit connection between the additives and choice. For
example, Abramsky’s (1993) computational semantics for intuitionistic Linear
Logic interprets A ⊗ B as an ordered pair ⟨A, B⟩ both of whose elements
will be used in further computation (eager evaluation); A & B, on the other
hand, denotes an ordered pair only one of whose elements will ever be used
(lazy evaluation), and of course A ⊕ B delivers a projection function that
chooses one or the other of the elements in a & pair. Unfortunately for our
purposes here, Abramsky’s computational interpretation of classical Linear
Logic involves parallel distributed processing, which would take us too far
afield.2
Most reassuringly familiar for linguists, Allwein and Dunn (1993) provide
a kosher Kripke-style possible worlds semantics, and that is the approach
that I will present here.
Following Allwein and Dunn, the expository strategy will be to begin with
an algebraic model that is faithful to the inference rules, then show how to
reconstruct that algebra in terms of worlds.
The algebraic model contains three main components: a lattice for modeling
the additive connectives, a unary operation for modeling negation, and a
binary operation for modeling the multiplicative connectives.
Additives: let A, ∧, and ∨ form a bounded lattice with partial order ≤ and
top and bottom elements. The lattice can be finite or non-finite, and it can be
distributive or non-distributive.
Negation: now let ∼ be a DeMorgan negation on that lattice. This means
that ∼ must be order-reversing (for all x, y in A, x ≤ ∼y iff y ≤ ∼x), and it
must be involutive (for all x in A, ∼∼x ≤ x).
Multiplicatives: we add a commutative, associative binary operation ◦
with identity element t (that is, t ◦ a = a = a ◦ t for all a in A). Thus A,◦, and
t form a commutative monoid. Note that t may be distinct from the top of
the lattice. The monoid operation must distribute over the join operation,
that is, for all a, b, c ∈ A : a ◦ (b ∨ c) = (a ◦ b) ∨ (a ◦ c). It must also be
compatible with negation in the sense that for all a, b ∈ A : a ◦ b ≤ c iff
a ◦ ∼c ≤ ∼b (“antilogism”).
2 Though it is intriguing to think that the meaning of some natural language expressions might
be appropriately modeled by a distributed process. Perhaps some permission sentences
denote programs which the recipient can execute in various environments in order to
produce whichever certificate of permission is required. Then a free choice permission
sentence denotes a program whose execution is blocked until it receives an external choice
(a selection of which alternative to deploy).
    5          x  ∼x        ◦ | 0 1 2 3 4 5
   / \         0  5         0 | 0 0 0 0 0 0
  3   4        1  3         1 | 0 1 2 1 2 5
  |   |        2  4         2 | 0 2 1 2 1 5
  1   2        3  1         3 | 0 1 2 3 4 5
   \ /         4  2         4 | 0 2 1 4 3 5
    0          5  0         5 | 0 5 5 5 5 5
The Hasse diagram on the left gives the lattice order in the usual way, so that
0 ≤ 1, 1 ≤ 3, and so on. In addition, since ≤ is reflexive and transitive, we
also have 0 ≤ 0, 0 ≤ 3, etc.
Since meet (∧) in a lattice is the unique greatest lower bound, it can be
read off the Hasse diagram, e.g., 5 ∧ 5 = 5, 4 ∧ 5 = 4, 4 ∧ 3 = 0, and so on
(dually for the join operation ∨).
It is easy to see by inspection that the negation relation ∼ is involutive
(e.g., ∼∼3 = 3) and order reversing (e.g., along with 0 ≤ ∼3 we have 3 ≤ ∼0).
Note that 3 serves as the identity element t of the monoid. Since the
monoid operation is commutative, the matrix is symmetric across the top-left
to bottom-right diagonal (e.g., 4◦2 = 2◦4). Furthermore, mechanical checking
will confirm that the monoid operation is associative (e.g., (4 ◦ 2) ◦ 1 = 4 ◦ (2 ◦
1)), that it distributes over the join operation (e.g., 3◦(1∨4) = (3◦1)∨(3◦4)),
and that it respects the antilogism requirement (e.g., 4 ◦ 2 ≤ 3 ≡ 4 ◦ ∼3 ≤ ∼2).
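The "mechanical checking" mentioned above can be carried out by brute force. The following Python sketch is only an illustration: the names DOWN, UP, NEG, MUL, and check_algebra are chosen here and do not come from the text, and the tables are transcribed from the diagram and matrices given above.

```python
from itertools import product

ELEMS = range(6)

# Down-sets read off the Hasse diagram: 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5.
DOWN = {0: {0}, 1: {0, 1}, 2: {0, 2}, 3: {0, 1, 3}, 4: {0, 2, 4}, 5: set(range(6))}
UP = {x: {y for y in ELEMS if x in DOWN[y]} for x in ELEMS}

NEG = [5, 3, 4, 1, 2, 0]  # NEG[x] is the negation ∼x
MUL = [                   # MUL[a][b] is the monoid product a ◦ b
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]

def leq(x, y):
    return x in DOWN[y]

def meet(x, y):
    # The meet is the point whose down-set is the intersection of the two down-sets.
    return next(z for z in ELEMS if DOWN[z] == DOWN[x] & DOWN[y])

def join(x, y):
    # Dually, the join is the point whose up-set is the intersection of the two up-sets.
    return next(z for z in ELEMS if UP[z] == UP[x] & UP[y])

def check_algebra():
    pairs = list(product(ELEMS, repeat=2))
    triples = list(product(ELEMS, repeat=3))
    return {
        "involutive": all(NEG[NEG[x]] == x for x in ELEMS),
        "order_reversing": all(leq(x, NEG[y]) == leq(y, NEG[x]) for x, y in pairs),
        "identity_is_3": all(MUL[3][a] == a == MUL[a][3] for a in ELEMS),
        "commutative": all(MUL[a][b] == MUL[b][a] for a, b in pairs),
        "associative": all(MUL[MUL[a][b]][c] == MUL[a][MUL[b][c]] for a, b, c in triples),
        "distributes_over_join": all(
            MUL[a][join(b, c)] == join(MUL[a][b], MUL[a][c]) for a, b, c in triples
        ),
        "antilogism": all(
            leq(MUL[a][b], c) == leq(MUL[a][NEG[c]], NEG[b]) for a, b, c in triples
        ),
    }
```

Calling check_algebra() returns one Boolean per requirement, so a failed condition can be identified by name rather than by inspecting the matrix by hand.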
A sequent Γ semantically entails ∆ (written ‘Γ ⊨ ∆’) just in case the
valuation of the multiplicative conjunction of the formulas in Γ is dominated
by the valuation of the multiplicative disjunction of the formulas in ∆. For
instance, since x ∧ y ≤ x for all x, y in A by the definition of meet in a
lattice, we have that A & B ⊨ A.
To illustrate how these tables provide a model of the logic, recall that we
have the following three theorems discussed in previous sections and one
non-theorem:
(18) a. (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ
b. (A ⊕ B) ⊸ δ ⊢ (A ⊸ δ) & (B ⊸ δ)
c. (A ⊸ δ) ⊕ (B ⊸ δ) ⊢ (A & B) ⊸ δ
d. (A & B) ⊸ δ ⊬ (A ⊸ δ) ⊕ (B ⊸ δ)
If the given algebra is a faithful model of Linear Logic, we expect that for
every valuation v assigning a lattice element to the propositional symbols
δ, A, and B, the valuation of the left hand side of any theorem will be
dominated (in the sense of the lattice order ≤) by the valuation of the right
hand side. This is the case for (18) (a) through (c), but we have a countermodel
for (18d): if v(δ) = 0, v(A) = 1, and v(B) = 2, then v((A & B) ⊸ δ) =
v(((A & B) ⊗ δ⊥)⊥) = ∼((v(A) ∧ v(B)) ◦ ∼v(δ)) = ∼((1 ∧ 2) ◦ ∼0) = 5. But
v((A ⊸ δ) ⊕ (B ⊸ δ)) = 0, and 5 ≰ 0.
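The claim about (18) can be checked by enumerating all 216 valuations of δ, A, and B in the six-point algebra. The Python sketch below is illustrative only; the function names (lolli, valid, and so on) are assumptions introduced here, and linear implication is computed as ∼(x ◦ ∼y), following the unfolding used in the countermodel above.

```python
from itertools import product

ELEMS = range(6)
DOWN = {0: {0}, 1: {0, 1}, 2: {0, 2}, 3: {0, 1, 3}, 4: {0, 2, 4}, 5: set(range(6))}
UP = {x: {y for y in ELEMS if x in DOWN[y]} for x in ELEMS}
NEG = [5, 3, 4, 1, 2, 0]
MUL = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]

def leq(x, y):
    return x in DOWN[y]

def meet(x, y):  # interprets the additive conjunction &
    return next(z for z in ELEMS if DOWN[z] == DOWN[x] & DOWN[y])

def join(x, y):  # interprets the additive disjunction ⊕
    return next(z for z in ELEMS if UP[z] == UP[x] & UP[y])

def lolli(x, y):  # interprets x ⊸ y as ∼(x ◦ ∼y)
    return NEG[MUL[x][NEG[y]]]

# Each sequent in (18) as a pair (left-hand side, right-hand side) of functions of (a, b, d).
SEQUENTS = {
    "18a": (lambda a, b, d: meet(lolli(a, d), lolli(b, d)), lambda a, b, d: lolli(join(a, b), d)),
    "18b": (lambda a, b, d: lolli(join(a, b), d), lambda a, b, d: meet(lolli(a, d), lolli(b, d))),
    "18c": (lambda a, b, d: join(lolli(a, d), lolli(b, d)), lambda a, b, d: lolli(meet(a, b), d)),
    "18d": (lambda a, b, d: lolli(meet(a, b), d), lambda a, b, d: join(lolli(a, d), lolli(b, d))),
}

def valid(name):
    """True iff the left side is dominated by the right side under every valuation."""
    lhs, rhs = SEQUENTS[name]
    return all(leq(lhs(a, b, d), rhs(a, b, d)) for a, b, d in product(ELEMS, repeat=3))
```

Under this encoding, valid("18d") fails, and the valuation a = 1, b = 2, d = 0 reproduces the values 5 and 0 computed in the text.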
There are (infinitely) many other possible choices for a lattice, and for
any given lattice, there may be many choices for a suitable negation and for
a suitable monoid operation. For instance, Restall (2000:170) gives an even
simpler (but still instructive) model of (distributive) Linear Logic based on a
four-element lattice. Since Linear Logic is sound and complete with respect
to the class of algebraic models given here, a sequent is a theorem iff its left
hand side semantically entails its right hand side for every valuation in every
model.
The next step is to associate each point in the lattice with a set of worlds.
If w is a world associated with the pair of sets of points ⟨F, I⟩, let w1 indicate
F and w2 indicate I. Then we can define a map β that takes each point p in
the lattice onto the set of worlds w such that p ∈ w1:
β(0) = {}
β(1) = {d}
β(2) = {c}
β(3) = {b, d}
β(4) = {a, c}
β(5) = {a, b, c, d}
In other words, we map each point in the lattice to the set of worlds that
make it true.
We now need to define relations over sets of worlds that will allow us to
reconstruct the logical operations we want to model: ∧, ∨, ∼, and ◦.
The meet operation is straightforward. We extend β in the following way:
β(p ∧ q) = β(p) ∩ β(q). So meet corresponds to simple set intersection.
Thus 4 ∧ 2 = 2, and β(4 ∧ 2) = β(4) ∩ β(2) = {a, c} ∩ {c} = {c} = β(2).
The join operation is not quite so straightforward. We cannot represent
join as set union. To see why, note that 3 ∨ 2 = 5, but β(3) ∪ β(2) =
{b, d} ∪ {c} = {b, c, d} ≠ β(5). The solution is to exploit the information
present in the second element in the pair of sets that define the worlds. To
do this, we define two operations on sets of worlds. Let W be our set of
worlds, and let C be any subset of W :
Although l and r are defined over all subsets of W , we will only need to apply
them in the following cases:
r (β(0)) = r ({}) = {a, b, c, d}
r (β(1)) = r ({d}) = {b, c}
r (β(2)) = r ({c}) = {a, d}
r (β(3)) = r ({b, d}) = {c}
r (β(4)) = r ({a, c}) = {d}
r (β(5)) = r ({a, b, c, d}) = {}
For instance, the reason a is not in r (β(1)) is because a2 ⊆ d2 , but d ∈ β(1).
Allwein and Dunn show that for all points p in the lattice, l(r (β(p))) = β(p).
We can now define join by shifting the conjuncts using r , then taking their
intersection, then shifting back using l: β(p ∨ q) = l(r (β(p)) ∩ r (β(q))). For
instance, we have β(1 ∨ 3) = l(r (β(1)) ∩ r (β(3))) = l({b, c} ∩ {c}) = l({c}) =
β(3). Trying the problematic case given above, β(3 ∨ 2) = l(r (β(3)) ∩
r (β(2))) = l({c} ∩ {a, d}) = l({}) = {a, b, c, d} = β(5), as desired.
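The computations with β, r, and l can be replayed in a few lines of Python. This sketch rests on two assumptions that should be read as reconstructions rather than as the official definitions: (i) the four worlds are the pairs ⟨F, I⟩ listed in WORLDS, inferred from the values of β and from the negation example; (ii) r(C) collects the worlds w such that w2 ⊆ x2 holds for no x in C, and l(C) collects the worlds w such that w1 ⊆ y1 holds for no y in C, which is what the remark about a and r(β(1)) suggests.

```python
# Assumed worlds: name -> (F, I), with F a set of points and I a disjoint set of points.
WORLDS = {
    "a": (frozenset({4, 5}), frozenset({0, 2})),
    "b": (frozenset({3, 5}), frozenset({0, 1})),
    "c": (frozenset({2, 4, 5}), frozenset({0, 1, 3})),
    "d": (frozenset({1, 3, 5}), frozenset({0, 2, 4})),
}

def beta(p):
    """Worlds whose first component contains the point p."""
    return frozenset(w for w, (F, _) in WORLDS.items() if p in F)

def r(C):
    """Worlds whose second component is included in that of no world in C (assumed definition)."""
    return frozenset(w for w in WORLDS if not any(WORLDS[w][1] <= WORLDS[x][1] for x in C))

def l(C):
    """Worlds whose first component is included in that of no world in C (assumed definition)."""
    return frozenset(w for w in WORLDS if not any(WORLDS[w][0] <= WORLDS[y][0] for y in C))

def beta_join(p, q):
    """Shift with r, intersect, shift back with l."""
    return l(r(beta(p)) & r(beta(q)))
```

With these definitions, r reproduces the six values listed above, l(r(β(p))) = β(p) holds for every point p, and beta_join(3, 2) yields all four worlds even though the plain union β(3) ∪ β(2) omits a.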
At this point, β, l, and r allow us to fully simulate the structure of the
lattice in terms of sets of worlds.
Representing negation: β(∼p) = {x | ⟨∼x2, ∼x1⟩ ∈ r (β(p))} (where applying
∼ to a set of points returns the set resulting from applying ∼ to each member
of the original set). For instance, we have β(∼1) = {⟨∼{0, 1}, ∼{3, 5}⟩,
⟨∼{0, 1, 3}, ∼{2, 4, 5}⟩} = {b, d} = β(3).
Note that linear negation expresses something about provability, not
about falsity. One way to see this is to observe that in this model, 3 and its
negation ∼3 = 1 are both true at world d.
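The negation clause can be checked in the same style. As before, this Python sketch is a hedged illustration: the world pairs in WORLDS and the definition of r are reconstructions assumed here, and star is a helper name introduced for the map x ↦ ⟨∼x2, ∼x1⟩.

```python
NEG = [5, 3, 4, 1, 2, 0]  # NEG[x] is ∼x in the six-point algebra

# Assumed worlds: name -> (F, I).
WORLDS = {
    "a": (frozenset({4, 5}), frozenset({0, 2})),
    "b": (frozenset({3, 5}), frozenset({0, 1})),
    "c": (frozenset({2, 4, 5}), frozenset({0, 1, 3})),
    "d": (frozenset({1, 3, 5}), frozenset({0, 2, 4})),
}
NAME = {pair: w for w, pair in WORLDS.items()}

def beta(p):
    return frozenset(w for w, (F, _) in WORLDS.items() if p in F)

def r(C):
    # Assumed definition: keep w unless its second component is included in that of some x in C.
    return frozenset(w for w in WORLDS if not any(WORLDS[w][1] <= WORLDS[x][1] for x in C))

def star(w):
    """The world whose pair is obtained by negating pointwise and swapping the components."""
    F, I = WORLDS[w]
    return NAME[(frozenset(NEG[p] for p in I), frozenset(NEG[p] for p in F))]

def beta_neg(p):
    """Worlds x such that star(x) lies in r(beta(p))."""
    return frozenset(x for x in WORLDS if star(x) in r(beta(p)))
```

On these assumptions beta_neg(p) coincides with β(∼p) for all six points, and world d belongs both to β(3) and to β(∼3) = β(1), which is the observation made in the text.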
Representing the tensor relation ◦ proceeds in two steps. In the usual
Kripke semantics, unary modal operators are characterized by an accessibility
relation, a two-place relation over worlds. Because the multiplicatives are
two-place connectives, we will need a three-place relation.3
tiny model, in the same way that a valuation for classical logic will be forced
to map very different formulas to the same truth value.)
So let’s say that I know we’re in a situation in which you are permitted
to eat an apple (say, point 4), and then I learn that you have eaten an apple.
Perhaps I watch you eat it. This changes things: I compute 4 ◦ 2 = 1.
Thanks to your eating an apple, we’re now in situation 1. And since 1 ≤ 3,
things are as they are supposed to be. In terms of worlds, δ is modeled by
worlds (situations) b and d; and since point 1 corresponds to (a singleton set
containing only) world d, we must be in a δ-world.
So, what if you are permitted to eat an apple or a pear? That’s ∼((2 ∨ 4) ◦
∼3) = 4. We just saw that if we start at 4 and you eat an apple, we land on a
δ-world. And indeed, if we’re at point 4 and you eat a pear instead, 4 ◦ 4 = 3,
and once again we’re in a δ-situation.
But what if you eat an apple and you eat a pear? 4 ◦ 4 ◦ 2 = 2. Situation
2 is not a δ situation, so things are not ok. Having permission to eat an
apple or a pear is not the same thing as having permission to eat an apple
and a pear. Likewise, if killing the postman is modeled by situation 4 (i.e.,
v(K) = 4), then eating an apple and killing the postman will definitely not
leave us in a δ-situation. (This small model is somewhat unrealistic, however,
in that there are situations in which eating an apple, killing the postman, and
then eating another apple is perfectly permissible.)
However, as emphasized above, having permission to eat an apple or a
pear is compatible with also having permission to eat both. Making use of
the same model, if we have v(A) = v(δ) = v(B) = 3, then v((A & B) ⊸ δ) =
v((A ⊗ B) ⊸ δ) = 3. With this valuation, eating apples and pears is truly
optional: you can eat an apple and stop, or you can eat a pear and stop, or
you can eat an apple and you can eat a pear, and in all three cases you’ll end
up in a δ-situation.
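The apple and pear scenarios can be replayed mechanically. The Python sketch below assumes the valuation v(A) = 2 (eating an apple), v(B) = 4 (eating a pear), v(K) = 4 (killing the postman), and v(δ) = 3, which is what the computations in the text imply; the helper names permitted, after, and ok are introduced here for illustration.

```python
from functools import reduce

ELEMS = range(6)
DOWN = {0: {0}, 1: {0, 1}, 2: {0, 2}, 3: {0, 1, 3}, 4: {0, 2, 4}, 5: set(range(6))}
UP = {x: {y for y in ELEMS if x in DOWN[y]} for x in ELEMS}
NEG = [5, 3, 4, 1, 2, 0]
MUL = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]

APPLE, PEAR, KILL, DELTA = 2, 4, 4, 3  # assumed valuation

def leq(x, y):
    return x in DOWN[y]

def join(x, y):
    return next(z for z in ELEMS if UP[z] == UP[x] & UP[y])

def permitted(action):
    """Value of (action ⊸ δ), computed as ∼(action ◦ ∼δ)."""
    return NEG[MUL[action][NEG[DELTA]]]

def after(state, *events):
    """Combine the current situation with each event in turn using ◦."""
    return reduce(lambda s, e: MUL[s][e], events, state)

def ok(state):
    """A situation counts as a δ-situation when it lies below v(δ)."""
    return leq(state, DELTA)
```

For example, after(permitted(join(APPLE, PEAR)), PEAR, APPLE) gives 2, which is not below 3, matching the observation that permission for an apple or a pear does not license eating both in this valuation.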
9 Conclusions
References
Abramsky, Samson. 1993. Computational interpretations of Linear Logic.
Theoretical Computer Science 111(1–2). 3–57. doi:10.1016/0304-3975(93)90181-R.
Accorsi, Rafael & Johan van Benthem. 1999. Lorenzen’s games and Linear
Logic. University of Amsterdam manuscript.
http://www.informatik.uni-freiburg.de/~accorsi/papers/games.pdf.
Akatsuka, Noriko. 1992. Japanese modals are conditionals. In Diane Brentari,
Gary Larson & Lynn MacLeod (eds.). The joy of grammar: A festschrift
in honor of James D. McCawley. Amsterdam: John Benjamins. 1–10.
Allwein, Gerard & J. Michael Dunn. 1993. Kripke models for Linear Logic. The
Journal of Symbolic Logic 58(2). 514–545. doi:10.2307/2275217.
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Alonso-Ovalle, Luis. 2006. Disjunction in alternative semantics. UMass
Amherst: PhD dissertation.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong
permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.
Chris Barker
10 Washington Place
New York, NY 10003, USA
chris.barker@nyu.edu
http://homepages.nyu.edu/~cb125
Semantics & Pragmatics Volume 3, Article 6: 1–54, 2010
doi: 10.3765/sp.3.6
The question addressed in this paper is a simple one: What is the difference in
meaning between singular and plural nominals in languages such as English,
where this distinction is morphologically marked?1 The issue then is to
characterize the semantic difference between the pair in (1), as it pertains to
information conveyed by the contrast in number.
In Link’s proposal, the domain of entities from which nominals take values
has the structure of a join-semilattice whose atoms are ordinary individuals
(in this case individual horses) and whose non-atomic elements are all the
possible sums of more than one atom (in this case groups of more than one
horse). Under the simple view, when a nominal is singular, the domain from
which its referent is chosen is the set of atoms in the semilattice denoted
by its head noun, while in case it is plural, its reference domain is the set of
sums in that semilattice.
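To make the atom/sum distinction concrete, the following Python sketch builds a small Link-style domain. It is an illustration under simplifying assumptions: individuals are modelled as strings, sums as sets of individuals, and the join operation as set union; the names link_domain, atoms, and sums are chosen here.

```python
from itertools import combinations

def link_domain(individuals):
    """All non-empty sums of the given individuals, modelled as frozensets."""
    items = sorted(individuals)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]

def join(x, y):
    """Sum formation, modelled as set union."""
    return x | y

def atoms(domain):
    """Ordinary individuals: the reference domain of a singular nominal on the simple view."""
    return [x for x in domain if len(x) == 1]

def sums(domain):
    """Proper sums: the reference domain of a plural nominal on the simple view."""
    return [x for x in domain if len(x) > 1]

HORSES = link_domain({"h1", "h2", "h3"})
```

With three horses the domain contains three atoms and four proper sums, and joining any two distinct atoms always lands among the sums.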
1 We use nominal here as a cover term for DPs and NPs. We limit the discussion to nominals
in regular argument position, and ignore special uses in predication, incorporation, etc.
(cf. de Swart & Zwarts 2009 and references therein for discussion of such constructions).
Among the languages that manifest a singular/plural morphological distinction are the
languages within the Germanic and Romance families as well as Finno-Ugric languages such
as Hungarian and Finnish. We will not deal here with languages that make more fine-grained
distinctions in number, involving duals or paucals (see Corbett 2000). Languages such
as Mandarin Chinese that lack morphological number distinctions are briefly taken into
consideration below but do not receive a full-fledged analysis in this paper; see Krifka
(1995) and Rullmann (2003) for relevant proposals. Nor do we go into issues concerning
non-morphological encoding of number information of the type discussed for Korean by
Kwon & Zribi-Hertz (2006) or for Papiamentu and Brazilian Portuguese by Kester & Schmitt
(2007).
6:2
Singulars and Plurals
Thus, a yes answer to (3a) normally commits the speaker to having seen one
or more horses; in (3b), the addressee is expected to call even if she has seen
a single horse in the meadow, and (3c) is judged false in case Sam has seen a
single horse in the meadow. The existence of inclusive readings comes as an
unpleasant surprise to the naïve view, which predicts that the plural forms
in (3a)-(3c) are interpreted exclusively, just like the plural in (1b).
Note next that even though plurals may receive an inclusive interpreta-
tion in questions and within the scope of negation, as shown in (3a-c), the
distinction between singulars and plurals is not fully obliterated in these
environments. This is illustrated by (4a) and (4b) taken from Farkas (2006)
and Spector (2007) respectively, who note that the plural is distinctly odd in
these examples because normally people have only one nose and only one
father.
The contrast between (3a-c) and (4a-b) shows that a plural form remains
sensitive to the atom/sum distinction, even in environments where it can
be interpreted inclusively. A plural is always odd when sum values are
pragmatically excluded from its domain of reference. Ideally, this property
should follow from the account of the semantics and pragmatics of number
interpretation without any specific stipulations.
So far then we have established that an account of number interpretation
has to explain why plural forms are susceptible to both exclusive and inclusive
readings, and furthermore, one has to understand why particular linguistic
environments favor one or the other shade of meaning, while at the same time
predicting the sensitivity of plural forms to sum reference in all contexts. In
6:3
Farkas and de Swart
the rest of this section we establish some further conceptual and empirical
challenges an adequate account of number must meet and discuss some of
the most influential previous ways of dealing with them.
In Section 2, which contains the core of our proposal, we give a semantics
for the singular/plural contrast. In keeping with facts about overt morphology
in the languages under consideration, we do not make use of a singular
morpheme and therefore do not assign singular forms any inherent ‘singular’
semantics. The plural morpheme on the other hand is treated as contributing
a polysemous meaning, with the inclusive and exclusive interpretations being
its two related senses. The atomic reference of the singular comes about in
our account as a result of the competition between singular and plural forms
in the spirit of previous analyses but starting from opposite assumptions.
This competition is modelled in bidirectional Optimality Theory.
In Section 3 we account for the inclusive/exclusive interplay exemplified
by the contrast between (1b) and (3a-c) by exploiting the Strongest Meaning
Hypothesis, an independently motivated pragmatic principle. We also show
that the analysis we propose predicts that a plural form always requires the
possibility of sum witnesses, thus explaining the contrast in (4a-b) without
any extra stipulation. Section 4 shows how the analysis of languages like
English extends to an apparent puzzling use of singular forms with sum
reference in Hungarian, while Section 5 sums up the results of the paper.
This idea is worked out in detail in Sauerland 2003 and Sauerland, Anderssen
& Yatsushiro 2005. In Sauerland et al. 2005, there are two number
features, SG and PL, located syntactically in the head of a φP node, as in Figure
1, where *boy is a number-neutral predicate, insensitive to the atom/sum
distinction:
[φP [φ SG/PL] [DP [D the] [NP *boy]]]
nothing about the presence or absence of P but is used chiefly (although not
exclusively) to indicate the absence of P (Jakobson 1939)”.
Any strong singular/weak plural analysis involves an anti-Horn pattern
because in such an approach the singular forms are assigned a strong seman-
tics (requiring atomic reference, which plays the role of P above), while plural
forms are given a weak interpretation, neutral with respect to whether values
are chosen among atoms or sums. Recently, Bale, Gagnon & Khanjian (in
press) have explicitly defended the anti-Horn pattern for number, claiming
that the empirical data are only reconcilable with a negative correlation be-
tween morphological and semantic markedness. A central goal of the present
paper is to challenge the anti-Horn view, and achieve a reconciliation of the
semantics and the morphology of number, formulated in A:
Analyses that are in line with the Horn pattern in the sense that they treat the
plural feature as making a semantic contribution while treating the singular as
semantically vacuous are called here weak singular/strong plural approaches.
They are preferable on theoretical grounds to their competitors because they
explain the asymmetry in number morphology in languages that have a plural
but no singular morpheme and thus reconcile morphological and semantic
markedness. Endowing the plural morpheme with a semantic contribution
and deriving the interpretation of singular forms from the absence of the
plural morpheme makes sense of the systematic morphological asymmetry
between singular and plural forms. The existence of inclusive plural readings
constitutes the main empirical challenge for the weak singular/strong plural
approach. Before we address this problem and offer a solution, we present
data from Hungarian that appears puzzling for a strong singular/weak plural
account but not for a weak singular/strong plural view.
In this subsection we review two sets of facts that add further challenges to
any account of number interpretation. The first comes from Hungarian, a
language that displays a pattern of number marking that raises an empirical
challenge to approaches that treat singular forms as requiring atomic refer-
ence. Just like English and other Indo-European languages, Hungarian has a
singular/plural distinction:
As one can see from these examples, such DPs must be morphologically
singular. Note that these cases involve not only cardinal numerals but other
types of Ds as well. Therefore, no analysis specific to cardinals, such as
the one proposed in Ionin & Matushansky 2006, can cover all the relevant
examples. That these DPs are semantically plural can be seen from the fact
that they may occur as subjects of verbs like összegyülni ‘to gather’, as seen
in (11a). The fact that they are not necessarily distributive, and therefore that
they are, or at least, can be, referential is shown in (11b-d).8 The data are the
same for all the D types exemplified in (9).
7 Hungarian is not the only language that displays this pattern of number marking, but it is
sufficient to work out the data for one particular language to make the relevant theoretical
point.
8 We are grateful to an anonymous reviewer for drawing our attention to the data in (11b-d).
Finally, note that the DPs in (9) are similar to plural DPs in that discourse
pronouns referring back to them must be plural. If (13) is the continuation of
(12a) and if the three children are to be the antecedent of the direct object
These observations show that the DPs in (10) are semantically plural in
that they refer to sums but that morphologically, they are singular. This
characterization accounts for the data under the assumption that Subject-
Verb agreement is sensitive to the morphological feature of the DP, while the
form of a discourse pronoun is sensitive to the semantics of its antecedent
(see Farkas & Zec 1995 for discussion). The morphology explains the intra-
sentential agreement pattern these DPs trigger while their semantics explains
the form of the discourse pronouns for which they serve as antecedents.
We see then that in Hungarian, singular forms must be used in certain
cases of sum reference, a situation that is problematic for any strong singular
view. The challenge raised by the Hungarian data reviewed here is formulated
in B.
formulated in C.
We capture this generalization below but will not work out the semantics of
Chinese nominals since we focus here on languages that have a morphological
number distinction. Our approach to these languages is compatible with
Rullmann & You’s semantics of Mandarin.
We have seen in this section that the naïve (and attractive) view of number
interpretation we started with, according to which singular nominals refer
to atoms and plural ones refer to sums, faces two stumbling blocks: (i) the
existence of cases in which plural nominals are interpreted inclusively; (ii) the
number marking system in languages like Hungarian, where certain singular
nominals must receive a non-atomic interpretation. Retreating to a view
according to which the singular is semantically potent while the plural is
semantically empty runs against the Horn pattern of markedness, and has
difficulty with the Hungarian data as well. In the remainder of this paper, we
work out an account of number interpretation which:
ii. respects the Horn pattern and is in line with the morphological
markedness facts (generalization A);
iii. predicts when inclusive interpretations are possible (1b vs. 3 and 7b
vs. 9);
In this section we give our account of the interpretation of the plural feature
and its associated morpheme in the languages under consideration and
derive the interpretation of singular forms based on it. We start from what
we consider the null hypothesis, according to which, in languages with a
binary number distinction, there is a single, privative morphological feature
[Pl] in nominals and no singular feature. We assume that this feature is
generated in NumP, a node that is dominated by DP. We give the feature [Pl]
a polysemous semantics and derive the restriction of singular nominals (i.e.,
nominals that lack the feature [Pl]) to atomic reference under bidirectional
optimization. The bidirectional OT model we use is based on Mattausch
2005, 2007, a set-up that captures the harmonization of unmarked forms
with unmarked meanings, and of marked forms with marked meanings.9
Our analysis is cast in the framework of Optimality Theory (OT), a theory that
defines well-formedness in terms of optimization over a set of output candi-
dates for a particular input. OT syntax, for instance, defines grammaticality
as the optimal form that conveys a particular meaning, and thus represents
the speaker orientation (production). OT semantics picks the optimal inter-
pretation of a given form as the meaning construed by the hearer for that
form (comprehension). Bidirectional OT deals with the syntax-semantics
interface by combining the two directions in an optimization process over
form-meaning pairs (Hendriks, de Hoop, de Swart & Zwarts 2010). This frame-
work is appropriate for the problem at hand because it allows us to treat
the interpretation of singular and plural nominals in tandem, as a matter of
competition between the two forms.
As was made clear in the previous section, we treat as fundamental
to the enterprise the fact that plural forms are morphologically marked
and singular forms are not. Bidirectional OT is particularly useful to us
because Mattausch (2005, 2007) has already worked out in this framework
an abstract way of modeling the association of forms and meanings as an
optimal communication strategy that captures the Horn pattern. His proposal
9 Mattausch’s work goes back to ideas developed by Jäger (2003) and Blutner (1998, 2000,
2004). For a slightly different bidirectional OT set-up, see Beaver 2002. For a comparison of
different bidirectional OT models, see Beaver & Lee 2004.
with the unmarked form and the association of marked meaning with the
marked form thereby capturing Horn’s division of pragmatic labor. The
universal constraint ranking Mattausch derives is given in (16):
Marked forms always violate *Mark, so under the ranking in (16), they only
appear with the marked meaning β. Mattausch (2005, 2007) derives the emer-
gence of Horn’s division of pragmatic labor as the optimal communication
strategy that arises under evolutionary pressure.
in i. For example, the definite article is semantically more complex than the
indefinite one under analyses where the definite article is associated with a
uniqueness requirement while the indefinite article is neutral in this respect
(see Heim 1991 and Farkas 2006).
In discussions of semantic markedness in the domain of number, the
notion of denotational markedness has dominated. As we have seen already,
Sauerland et al. (2005) take the plural to denote within the entire domain of
the nominal (= *N, cf. Figure 1 above) while Bale et al. (in press) assign the
plural feature an augmentative semantics, which takes the join of all atoms
and sums in the semi-lattice of N. As a result, the denotation of the singular
(which has to have atomic reference) is a strict subset of the denotation of
the plural in both proposals. The same is true for the account in Spector
2007. Approaches which rely on denotational markedness alone then lead
to an anti-Horn analysis. However, this is not the only way one can go in
relating number interpretation to the Horn pattern.
Our analysis of number is grounded in a notion of markedness in terms
of semantic complexity. No singular feature is posited for a singular nominal,
and no inherent number semantics is assigned to this form. Plural nominals
are assumed to involve an overt plural feature [Pl] realized by a plural mor-
pheme whose semantic contribution concerns the atom/sum distinction. If
a singular nominal is not inherently associated with any number semantics
while a plural nominal comes with such a constraint, the singular form quali-
fies as semantically unmarked relative to the plural with respect to semantic
complexity.
In addition, in terms of conceptual complexity, we take atomic reference
to be less marked than sum reference. We follow Link (1983) in taking the
domain of interpretation from which variables are assigned values to consist
of atoms and sums, where the latter are built from the former by means of
the join operation ⊕. Given that atoms may exist independently of sums, but
not the other way around, a nominal that denotes within the domain of sums
is conceptually more marked than one that denotes within the domain of
atoms only.
Support for this conceptual markedness view comes from psychological
research that points to the special nature of sum reference. Recent
studies suggest that non-human primates and children under
two represent small sets of objects as object-files, and do not establish a
singular-plural distinction based on atoms vs. sums (Hauser, Carey & Hauser
2000, Feigenson, Carey & Hauser 2002, Feigenson & Carey 2003, 2005). The
The forms we are dealing with in the bidirectional optimization process are
morphologically singular and plural nominals, which we denote by sg and pl.
Sg here is short for a DP that has no number feature in its NumP, while pl
is short for a nominal that has the feature [Pl] in NumP. The interpretations
associated with these forms are atomic reference and inclusive or exclusive
sum reference respectively, which we denote by at and i/e sum. The bias
constraints for number are given in (17):
The ranking in (19) captures the insight that nominals marked with the
feature [Pl] must include sums within their domain of reference, and that the
interpretation of a singular form (when in competition with a plural) is atomic
reference. In line with Horn’s division of pragmatic labor, the ranking in (19)
pairs up marked plural forms with marked sum reference meanings, and
unmarked singular forms with unmarked atomic meanings. The optimization
over form-meaning pairs under this ranking is spelled out in the bidirectional
Tableau 1, where singular and plural forms are paired up with their respective
domain of interpretation in the lattice.
All possible form-meaning combinations are listed in the first column,
and constitute the input to the bidirectional optimization process. The
interpretations that particular forms are paired up with restrict the possible
witnesses of the nominal. Atomic reference, represented as at, limits possible
witnesses to atoms only. Exclusive sum reference, represented as sum, limits
possible witnesses to sums only. Inclusive sum reference, represented as
at ∪ sum, allows witnesses to be chosen both from the domain of atoms and
that of sums.
The four bias constraints, plus the markedness constraint *functN are
ranked across the top, where the left-right order reflects a decreasing order
of strength, and follows the ranking in (19). The two bias constraints *sg, i/e
sum and *pl, at are ranked above the markedness constraint *functN, but
their mutual order is irrelevant, which is reflected in the dotted line between
the two columns. Similarly, (19) requires the two constraints *pl, i/e sum and
*sg, at to be both ranked below *functN, but their mutual order is irrelevant,
as marked by the dotted line.
Because of the set-up with the bias constraints, all form-meaning combi-
nations incur one or more violations, marked by an asterisk ∗ in the relevant
cell. The schema in (19) ranks the bias constraints penalizing the combination
of singular forms with (inclusive or exclusive) sum reference and the combi-
nation of plural forms with atomic reference above the markedness constraint
*functN, which is what drives the optimization over form-meaning pairs in
Tableau 1. The constraints militating against the combination of plural forms
with (inclusive or exclusive) sum reference, or the combination of singular
forms with atomic reference are ranked below *functN, and are de facto
inactive in the optimization process.12
Tableau 1 shows that we assign the (unmarked) singular form the (unmarked)
meaning of atomic reference under strong bidirectional optimization,
because ⟨sg, at⟩ constitutes a bidirectionally optimal pair: there is
no better form to convey atomic reference, and there is no better meaning
to associate with a singular form. The expression of sum reference calls for
the use of a plural form. Both sum and at ∪ sum qualify as sum reference, so
plural forms have exclusive or inclusive sum reference. Accordingly, both
⟨pl, sum⟩ and ⟨pl, at ∪ sum⟩ qualify as bidirectionally optimal pairs. Crucially,
however, a plural form cannot be used in case sums are not part of the
meaning to be expressed, because ⟨pl, at⟩ is suboptimal. In line with Horn’s
division of pragmatic labor then, unmarked forms pair up with unmarked
meanings, and marked forms pair up with marked meanings. Given this
analysis, singular nominals have exclusive atomic reference when in compe-
tition with the plural, while plural nominals have (inclusive or exclusive) sum
reference.13
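The optimization just described can be replayed mechanically. The following is a minimal sketch, not the authors' implementation: it assumes that *functN is violated by every plural form (i.e. by the presence of [Pl] in NumP), that each bias constraint is violated exactly by the form-meaning combination it names, and that constraints in the same stratum are unranked with respect to each other. The names STRATA, violated, profile and bidirectionally_optimal are illustrative only.

```python
from itertools import product

FORMS = ("sg", "pl")
MEANINGS = ("at", "sum", "at+sum")  # "at+sum" stands for the union of atoms and sums

# Strata from strongest to weakest; constraints inside a stratum are mutually unranked.
STRATA = (
    ("*sg,i/e sum", "*pl,at"),
    ("*functN",),
    ("*pl,i/e sum", "*sg,at"),
)

def violated(form, meaning):
    """Constraints violated by a form-meaning pair (assumed violation pattern)."""
    includes_sums = meaning != "at"
    marks = {f"*{form},{'i/e sum' if includes_sums else 'at'}"}
    if form == "pl":
        marks.add("*functN")  # assumption: [Pl] counts as extra functional structure
    return marks

def profile(form, meaning, strata):
    """Number of violations per stratum; a smaller tuple is a better candidate."""
    marks = violated(form, meaning)
    return tuple(sum(c in marks for c in stratum) for stratum in strata)

def bidirectionally_optimal(strata=STRATA):
    """Pairs for which no other form suits the meaning better and vice versa."""
    optimal = []
    for form, meaning in product(FORMS, MEANINGS):
        own = profile(form, meaning, strata)
        better_form = any(profile(f, meaning, strata) < own for f in FORMS)
        better_meaning = any(profile(form, m, strata) < own for m in MEANINGS)
        if not (better_form or better_meaning):
            optimal.append((form, meaning))
    return optimal

print(bidirectionally_optimal())
```

Under these assumptions the sketch returns the three pairs discussed in the text, and dropping either the middle stratum or the lowest stratum leaves the result unchanged, in line with footnote 12.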
We are proposing here a weak singular/strong plural account in which
plurals are formally marked with a feature that is interpreted in compositional
semantics, as spelled out in (20), while singular nominals have no explicit
number feature and are restricted to atomic reference only as a result of the
competition with the plural form. We capture this asymmetry by assuming
that the interpretation of the feature [Pl] is as given in (20), where *P is the
number neutral property denoted by the head noun and its complement (cf.
Section 1.2 above). For any given occurrence of a plural form, either (20a) or
(20b) holds:
12 Technically, either the set of four bias constraints or the combination of *FunctN with the
bias constraints ranked above it (i.e. *sg, i/e sum and *pl, at) is sufficient to obtain three
bidirectionally optimal pairs in the ordinal Tableau 1. That is, leaving out either *pl, i/e
sum and *sg, at or *FunctN would not change the outcome of the optimization process.
However, in Mattausch’s system, we need a markedness constraint in the learning system in
order to derive a 100% form-meaning distribution in the stochastic grammar. Note also that
*FunctN plays a key role in the unidirectional optimization in Section 4.
13 The crucial difference between λx [x ∈ Sum ∪ Atom & *P(x)] and λx *P(x) is precisely the
fact that the former is semantically plural, necessitating the possibility of sum reference
while the latter is number neutral and thus truly insensitive to the atom/sum divide. We
have seen in examples (4a-b) that the plural is indeed not insensitive to the atom/sum divide,
and will work out in Section 3.3 an analysis of choice of form that brings out the relevance of
sum reference for plurals.
Thus, nominals with the feature [Pl] are incompatible with the conceptually
unmarked meaning, namely exclusive atomic reference. The reference of
such forms is restricted by the contribution of the feature [Pl] to a domain
that includes sums either to the exclusion of atoms or not. In Section 3 we
exploit the entailment relation between these two senses to account for the
pragmatic factors that play a role in choosing one sense over the other.
Since our analysis appeals to the feature [Pl] but has not implicated
particular determiners, it carries over straightforwardly to the definite plural
in (23a):
their own since they combine with both singular and plural nominals. In
the case of the latter, the feature [Pl] is present in the NumP and brings its
contribution to the semantic interpretation of the DP. Given that we assume
NumP to be dominated by DP, we also assume that the feature [Pl], like other
agreement features, percolates to the DP in order to trigger plural agreement
outside the DP, as in the case of Subject-Verb agreement.
Our account leads us to expect inclusive plural possessive or definite
DPs alongside inclusive plural indefinites. Example (24) shows that this
expectation is met:
(24) [Instruction for parents picking up their kids from day care after an
outing in different groups]: If your children are back late, you have to
wait.
(25) If you have ever seen horses in this meadow you should call us.
The plural definite in (23a), on the other hand, gets an exclusive plural
interpretation on a par with that of the bare plural in (21a).
Singular nominals do not involve a singular feature in NumP and therefore
they do not have an inherent denotation restriction concerning the atom/sum
divide imposed by any of their subparts. The denotation of the singular nom-
inal horse is the number neutral property λx[*Horse (x)], an interpretation
that is insensitive to the atom/sum divide. Crucially, however, we assume
that the interpretation of count nominals in argument position in languages
with morphological number has to involve information concerning the atomic
vs. sum nature of their referent. In other words, a nominal that introduces
a discourse referent (i.e., a nominal of type e) in these languages has to be
interpreted as giving information concerning the atom/sum nature of its
possible witnesses.
Under standard assumptions in Discourse Representation Theory, dis-
course referents are introduced at the point when the D combines with its
sister (cf. Kamp & Reyle 1993; Kamp & van Eijck 1996). Because number
restrictions target the possible values of discourse referents, we assume
that it is at this point that the presence of a number restriction becomes
relevant. In the case of plural nominals, the interpretation of the feature [Pl]
in NumP contributes the required number restriction. In the case of singular
nominals, however, there is no explicit number feature that can contribute
the required number information. When a singular nominal combines with a
determiner that itself is not specified for number, such as the definite article
the, number specification is contributed via the optimization mechanism
given above. Such a singular DP denotes exclusively within the set of atoms
because allowing reference to sums has to involve the presence of [Pl] in
NumP according to Tableau 1. Thus, at the point when a number neutral D
such as the combines with a morphologically singular sister nominal that has
no inherent number specification either, such as horse, the system of con-
straints in Tableau 1 enriches the interpretation of the DP with the constraint
x ∈ Atom imposing exclusive atomic reference on the DP because this is the
optimal number interpretation for a DP that is not marked with the feature
[Pl]. The compositional semantics yields no number requirement on its own
but in the absence of plural morphology, the DP will be interpreted as having
atomic reference.15
Note that our account of the interpretation of singular DPs is similar in
spirit to Krifka’s account of number interpretation for plural nominals. For
Krifka, singular DPs are marked for atomic reference and plural nominals
denote in the complement of the singular forms. For us, plural nominals
are marked for including sums in their reference domain, and singular DPs,
when in competition with plurals, denote in the complement of the plural
form, i.e., they are interpreted as having exclusive atom reference.
The truth conditions of sentences like (1a), repeated here as (26a), involv-
ing a singular form in competition with a plural one, are then as given in
(26b).
Note that the definite article requires uniqueness, whereas the indefinite
article just contributes existential quantification. We exploit this difference
in Section 3.2 below.
The account we proposed above allows explicit semantic information
to be contributed by an unmarked form that has no inherent semantics on
the basis of the competition with a marked form with a specific semantics.
The bidirectional OT system spells out the details of a blocking account in
the spirit of Krifka (1989) and Sauerland et al. (2005) with the important
difference that in our system the existence of the semantically and morpho-
logically marked plural form affects the interpretation of the semantically
and morphologically unmarked singular form rather than the reverse.
The most important challenge for a weak singular/strong plural view
is the existence of inclusive readings of plural forms. The polysemous
semantics of plural nominals that we adopted in (20) as the outcome of the
bidirectional optimization process meets this challenge as it leaves room for
both inclusive and exclusive sum reference. Following the spirit though not
the letter of Sauerland et al. (2005) and Spector (2007), we rely on pragmatics
to determine the choice between these two senses in context and give, in the
next section, a pragmatic account of the contrast between (1b), (3a-c) and (4a,
b).
16 We assume here that the atom condition is part of the semantics of the relevant DPs.
Alternatively, one could treat it as an implicature whose generation would rely on the
constraints in Tableau 1. Under both views singular DPs are taken to denote within the realm
of atoms because languages that have a plural form have to use it, other things being equal,
in case sums are among the possible referents of the DP and thus, the existence of the plural
blocks the singular from being interpreted as having sum reference. The implicature analysis
sketched here differs sharply from the use of implicature in Spector (2007) summarized in
Section 1.2 above.
is one of the factors that govern the choice between the inclusive and the
exclusive sum interpretation of plural nominals.
The Strongest Meaning Hypothesis applies when an expression is assigned
a set of interpretations ordered by entailment and chooses the strongest
element of this set that is compatible with the context.17 The two senses of
the feature [Pl] in our account, given in (20), are ordered by (truth-conditional)
strength: an existentially closed proposition involving the exclusive sense
asymmetrically entails the same proposition involving the inclusive sense.
Because of this relationship the choice between interpretations of the [Pl]
falls under the jurisdiction of SMH. Our hypothesis is formulated as smh_pl
(the Strongest Meaning Hypothesis for Plurals):
more than one horse (exclusive plural). Given that the inclusive interpretation
of the plural in (3c) leads to a stronger claim for the negative sentence than
the exclusive interpretation, the former interpretation is preferred under the
smh_pl. We assume here that the smh_pl is relevant to bringing about the
inclusive interpretation of plurals in questions as well, though the details of
how to compute the strength of questions must remain an open issue for
the time being, despite the fact that the affinity between downward entailing
contexts and questions has been noted for a long time.19 Other things
being equal then, the smh_pl predicts that a plural nominal is interpreted
inclusively in downward entailing contexts and questions, and exclusively in
upward entailing ones. This is indeed the situation we find in (21)-(24).
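The prediction of the smh_pl for upward and downward entailing contexts can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: worlds are identified with the number of horses seen (0 to 3), a downward entailing context is modeled as plain negation, and "compatible with the context" is modeled as having a non-empty overlap with a set of live worlds. The names WORLDS, SENSES, embed and smh_pl are hypothetical.

```python
# Each world is identified with the number of horses Mary saw.
WORLDS = frozenset(range(4))

# The two senses of the plural, as sets of worlds verifying "Mary saw horses".
SENSES = {
    "exclusive": frozenset(n for n in WORLDS if n >= 2),  # sum witnesses only
    "inclusive": frozenset(n for n in WORLDS if n >= 1),  # atom or sum witnesses
}

def embed(prop, context):
    """Proposition expressed in an upward context or under negation (downward)."""
    return prop if context == "upward" else WORLDS - prop

def smh_pl(context, live_worlds=WORLDS):
    """Return the sense yielding the strongest proposition compatible with the context."""
    candidates = {s: embed(p, context) for s, p in SENSES.items()}
    compatible = {s: p for s, p in candidates.items() if p & live_worlds}
    for sense, prop in compatible.items():
        # The strongest proposition entails (is a subset of) every other candidate.
        if all(prop <= other for other in compatible.values()):
            return sense
    return None

print(smh_pl("upward"), smh_pl("downward"))
```

In this toy model the exclusive sense wins in the upward context and the inclusive sense wins under negation, because negation reverses the entailment relation between the two senses.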
Note that the SMH as advanced by Dalrymple et al. (1998), Winter (2001)
and Zwarts (2004) is a pragmatic principle, and as such it can be overridden
by contextual pressure. If the smh_pl is indeed responsible for the choice of
interpretation for plural nominals, we expect pragmatic pressure to render
it inoperative, and make inclusive interpretations available even in upward
entailing environments. We argue below that this is indeed the case.
Under the assumption that the speaker knows the facts, the plural form
in sentences such as (21a) and (23a) will receive an exclusive interpretation
which is informationally stronger than the inclusive one. Furthermore, in
these cases there is a single relevant witness for the plural nominal. Under
the assumption that the speaker is in full possession of the facts, she should
know whether this witness is an atom or a sum. In the first case she should
use a singular form because that is the best expression for conveying atomic
reference, given the high ranking of the constraint *pl, at (cf. Tableau 1).
In the latter case, she should use the plural form, given the equally high
ranking of *sg, i/e sum. Under the assumption that the speaker knows what
Mary saw/touched then, there is no possibility to weaken (21a) or (23a) to an
inclusive plural interpretation under the bidirectional optimization process
spelled out in Section 2. But in contexts where the speaker is assumed to
in fact lack information concerning the atomic/sum nature of the relevant
witness, the smh_pl no longer requires the exclusive reading and thus inclu-
sive readings of plurals become possible even in upward entailing contexts.
19 Obviously, questions are not generally perceived as downward entailing, but they are subject
to the same principle of scale reversal, as evidenced by the well-known fact that NPIs are
often licensed in all these environments (cf. Guerzoni & Sharvit 2007 for a fine-grained
discussion of NPI licensing in questions, and Ladusaw 1996 for a general overview of NPI
licensing).
So far, we have proposed that the choice between the two senses of the plural
is influenced by monotonicity. In upward entailing contexts, a plural form is
normally interpreted as exclusive whereas in downward entailing contexts
and questions, scale reversal leads to an inclusive interpretation. This raises
the question of what happens in quantificational contexts.21
20 We thank one of the participants in ‘A bare workshop 2’ (LUSH, June 2008) for suggesting
the example in (28a).
21 We are grateful to an anonymous reviewer for suggesting to us to discuss the implications
of our analysis for plurals in quantificational contexts. We only discuss bare plurals and
If the smh_pl is indeed involved in the choice between the two senses
of the plural morpheme we expect, other things being equal, a difference
in interpretation of plurals depending on whether they are in the Restrictor
or the Nuclear Scope of a distributive universal quantifier because the Re-
strictor of such a quantifier is downward entailing and the Nuclear Scope is
upward entailing. We therefore expect the plural in (29) to favor an exclusive
reading:22
Sentence (30) can be used to describe a ‘mixed’ situation, in which each boy
invited all the sisters he has, which for some boys means inviting just one
sister while for others, it means inviting several. The question that arises is
how to account for the difference in interpretation between the bare plural
in (29), which seems to favor an exclusive interpretation, and the definite
plural in (30), which seems to allow an inclusive reading more readily. Section
2 developed a unified analysis of plural morphology so if bare plurals and
definite plurals behave differently here, the difference in our account can
only be due to the definite/indefinite contrast. Here we sketch a possible
explanation of the contrast in number interpretation based on the contrast
in definiteness.
The crucial difference, in our account, between (30) and (29) relates to the
contrast between (31) and (32):
that the maximal entity that is a sister of theirs is atomic because reference
to sum values is not allowed for a singular DP. A mixed situation in which
each boy invited the maximal entity that encompasses his sister(s) can be
described with a definite plural interpreted inclusively because in that case
the maximality requirement of the definite is met as long as for each boy
in question there is no sister that remains uninvited. The inclusive plural
requirement is met because although some witnesses are atoms there are
others which are sums. Note that, as expected, (30) cannot be used in case all
boys have a single sister whom they invited (cf. Section 3.3 below). The truth
conditions of the definite singular are incompatible with a mixed situation
where no sister is left uninvited and some boys have one sister while others
have more than one, while the truth conditions of the definite plural, under
the inclusive interpretation, are compatible with such a situation.
The indefinite singular on the other hand is truth conditionally compatible
with a mixed situation in which some boy invited one friend of his while
others invited several. This is because the predicate invited a friend of his can
be true of a boy that has several friends and invited only one of them precisely
because the indefinite, unlike the definite, has no maximality requirement. If
maximality is not part of the semantics of the sentence, the truth conditions
of a singular form are compatible with a mixed situation, where some boys
invited one of their friends and others invited several.
The contrast between (29) and (30) then is due to the fact that a ‘mixed
situation’ is incompatible with the truth conditions of (31) but compatible
with those of (32). Thus, the contextual pressure to override the smh_pl and
give the plural an inclusive interpretation when in the Nuclear Scope of a
distributive quantifier is stronger in the case of definites than in the case
of indefinites because for definites the singular form is truth conditionally
incompatible with a mixed situation while for indefinites this is not so.
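The role of maximality in this contrast can be checked on small models. The sketch below is a simplification under stated assumptions: each boy is a record with the set of individuals he stands in the relevant relation to (sisters or friends) and the set he invited; the definite is modeled as requiring that the whole set be invited; and the inclusive plural requirement is modeled as the existence of at least one sum witness. All function and variable names are illustrative.

```python
def definite_singular(has, invited):
    """'invited his sister': the maximal entity is an atom and it is invited."""
    return len(has) == 1 and has <= invited

def definite_plural_inclusive(has, invited):
    """'invited his sisters', inclusive sense: the maximal entity is invited."""
    return len(has) >= 1 and has <= invited

def indefinite_singular(has, invited):
    """'invited a friend of his': at least one is invited, no maximality."""
    return len(has & invited) >= 1

def every(boys, predicate):
    """Distributive universal over the boys."""
    return all(predicate(b["has"], b["invited"]) for b in boys)

def sum_witness_exists(boys):
    """Inclusive plural requirement: some witness is a sum."""
    return any(len(b["has"]) > 1 for b in boys)

# Mixed situation: one boy has one sister (or friend), the other has two.
mixed = [
    {"has": {"ann"}, "invited": {"ann"}},
    {"has": {"bea", "cleo"}, "invited": {"bea", "cleo"}},
]
# Every boy has exactly one sister.
all_single = [
    {"has": {"ann"}, "invited": {"ann"}},
    {"has": {"dora"}, "invited": {"dora"}},
]

print(every(mixed, definite_singular))    # definite singular fails in the mixed case
print(every(mixed, indefinite_singular))  # indefinite singular holds in the mixed case
```

In the mixed model the definite singular is false while the indefinite singular is true, so only for definites is the plural the sole form compatible with the situation. In the all_single model the definite plural would be true on its inclusive truth conditions, but no sum witness exists, which matches the observation that the plural cannot be used there.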
We have claimed in this subsection that the choice between the inclusive
and exclusive senses of the plural is sensitive to the smh_pl, which favors
the exclusive interpretation of plural forms in ordinary upward entailing
environments and the inclusive interpretation in ordinary downward entailing
ones. Since the smh_pl is a pragmatic principle it can be overridden by
contextual factors involving cases where the speaker is assumed to describe
a ‘mixed situation’, one where some relevant witnesses are atoms and others
are sums. This may arise either because of speaker ignorance of the nature
of the relevant witness (as in 28) or because the speaker knows that both
types of witnesses are involved and using the plural form is the best way to
In order to account for the contrast between singular and plural forms in
examples like (33), Spector assumes an additional modal presupposition
associated with (indefinite) plurals that explicitly requires the possibility of
a sum witness. The bidirectional OT analysis developed in Section 2.2 and
25 Spector (2007) raises a further empirical issue, namely the interpretation of the plural in
examples such as (i).
This sentence is interpreted as claiming that one student brought more than one bottle
of wine to the party and no other student brought any bottles of wine to the party. This
is a problematic example for us because one and the same plural nominal appears to be
interpreted both exclusively (in the positive part of the interpretation) and inclusively (in
its negative part). This is a problem we leave open for the time being noting that a full
discussion would have to involve both the optimal interpretation of exactly and the way
closely related senses of polysemous items interact with it.
These are clear cases since knowledge of the world tells us that people have
one nose and that eyes come in pairs. The relevance of sums, inherent to the
optimization over forms, accounts for the contrasts in (33) and (34).
There are, however, less clear cases, in which the issue of whether sums
are relevant is a more subtle pragmatic matter. Following Farkas (2006),
we adopt the hypothesis that in some cases there are default expectations
with respect to the atom vs. sum nature of relevant witnesses and that these
expectations affect the choice of a singular vs. a plural in environments
that are otherwise friendly to inclusive plural interpretations. The analysis
developed here captures this effect. To exemplify, note that when it comes
to a person having an MA degree, it is simply a default expectation that if
they have such a degree, they will have only one. Nothing stops people from
piling up multiple MA degrees in their academic career, so sum witnesses in
this case are not absolutely excluded. But normally, a person obtains just
a single MA degree, so sum witnesses are not among the expected, default
witnesses. Under the analysis developed above, we expect that the unmarked
way of inquiring whether a person has an MA degree is (35a), with a singular.
The question with the plural form is unusual because it explicitly requires
one to include sums among possible witnesses. Indeed (35b) suggests that
the speaker is inquiring after the possibility of having multiple MA degrees.
The use of the plural here signals deviation from default expectations.
The contrast between (35) and (36) is due to the difference in whether one
expects sums to be among the relevant witnesses or not. Since the choice of
a plural form always requires sum witnesses to be relevant, such a form is
natural in (36a) but is unusual in (35b).
The pragmatic relevance of sums also plays a role in (37a), the example
most frequently cited as support for the existence of an inclusive reading of
the plural.
The domain from which the nominal chooses witnesses in (37a) is a mixed one
since there is no default expectation with regard to how many children a per-
son has. In this case then sums are part of the pragmatically relevant domain
and therefore the choice of a plural form is predicted to be appropriate on a
tax form, for instance. In (37b) on the other hand, we changed the example so
that now the presence of sum values among the default witnesses is removed
and, as expected, a singular form is the natural one in a questionnaire in this
case.
In the examples discussed so far, common world knowledge shared
between speaker and hearer is sufficient to account for the optimal choice
of the singular or plural form. The two questions in (38) illustrate that the
choice of form may also depend on the context of use.
(38) a. Do you have a broom? (asked in your kitchen after I spilled peas
on your floor)
b. Do you have brooms? (asked in a store)
As far as we can see, there are no special expectations about people having
one or more than one broom in their house, if they have any. In addition,
given the context of use sketched for (38a), the speaker is not expected
to need more than one broom. The choice of a singular form is therefore
expected, since sum witnesses are not relevant to the situation of use nor
is the relevance of sums imposed by common world knowledge. In a store,
on the other hand, the relevant witness is by default a sum, since stores
normally sell more than one item of a particular type, if they sell that type
of item at all. A plural form then is the natural choice in (38b) not because
the speaker is interested in buying more than one broom but because of the
default sum value expectation associated with the positive answer to her
question.
the other hand, is like the definite article in that it has no inherent lexical
restrictions pertaining to number interpretation.
The core analysis set up in Sections 2 and 3, illustrated with English,
extends to other languages that have a morphological number distinction
such as Germanic or Romance languages as well as to non-Indo-European
languages such as Hungarian. In DPs whose determiner does not contribute
number information, we expect the effect of the feature [Pl] on the nominal
to be the same as in English. We now turn to the data noted in Section 1,
where we saw that English and Hungarian contrast in case the determiner is
lexically marked for sum reference.
These DPs are singular in form (and trigger singular agreement with the V
when in subject position), and yet they have exclusive sum reference. This
then is an environment where the semantic contrast between singular and
plural forms is neutralized in Hungarian.
What needs an explanation now is why in languages that have a mor-
phological number contrast if the D is marked for sum reference, we find
two options: (i) the language may require the number contrast to be mor-
Note that in order to obtain the intended meaning, the fact that the nominal
is singular should have no interpretive consequence here. The weak inter-
pretation of singular nominals in (44a) yields the desired interpretation, but
the strong singular semantics in (44b) does not, which is why the Hungarian
facts are problematic for accounts in which singular forms are semantically
potent while being at least compatible with a ‘weak singular’ approach such
27 The semantics spelled out in (44) may be an oversimplification, given the more fine-grained
analyses of the differences in meaning between three children and at least three children
that have been offered in the recent literature (cf. Nouwen & Geurts 2007 and references
therein). However, the observations made in these works are tangential to the issues at
stake in this paper, because they focus on the role of the determiner, not the singular/plural
distinction on the noun. So we ignore these complications here.
(45) Fpl: Sum reference must be encoded in the functional structure of the
nominal.
(46) Maxpl: Mark with [Pl] nominals that have sum reference.
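The interaction of (45) and (46) with *functN can be sketched as a unidirectional optimization over forms for a sum-denoting nominal. This is a hedged illustration, not the paper's tableaux: it assumes that Fpl is satisfied when either [Pl] or a determiner lexically marked for sum reference is present, that Fpl is top-ranked in both languages, and that the two languages differ only in the relative ranking of Maxpl and *functN. The names number_violations, RANKINGS and optimal_form are hypothetical.

```python
def number_violations(candidate, d_marks_sum):
    """Violations incurred by a sum-denoting nominal realised as 'sg' or 'pl'."""
    plural = candidate == "pl"
    return {
        "Fpl": int(not (plural or d_marks_sum)),  # sum reference encoded nowhere
        "Maxpl": int(not plural),                 # sum reference without [Pl]
        "*functN": int(plural),                   # cost of the [Pl] feature
    }

# Assumed rankings, strongest constraint first.
RANKINGS = {
    "English": ("Fpl", "Maxpl", "*functN"),
    "Hungarian": ("Fpl", "*functN", "Maxpl"),
}

def optimal_form(language, d_marks_sum):
    """Pick the form whose ranked violation profile is smallest."""
    ranking = RANKINGS[language]

    def profile(candidate):
        marks = number_violations(candidate, d_marks_sum)
        return tuple(marks[c] for c in ranking)

    return min(("sg", "pl"), key=profile)

for language in RANKINGS:
    print(language, optimal_form(language, True), optimal_form(language, False))
```

With a determiner marked for sum reference, the sketch selects the plural noun under the English ranking and the singular noun under the Hungarian ranking. With a number-neutral determiner, both rankings select the plural.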
5 Conclusion
and involves the blocking of one form by the existence of the other. We couch
it in terms of bidirectional Optimality Theory because this framework is par-
ticularly suitable for capturing the phenomenon of blocking. In Bidirectional
OT, the syntax-semantics interface is defined in terms of optimization over
form-meaning pairs, making use of a mechanism that selects the optimal
meaning for a particular form, and the optimal form for a particular meaning.
The crucial novelty of this paper is that it reverses the direction of block-
ing. We have worked out a weak singular/strong plural account of number
interpretation for the languages under consideration, in which there is no
singular feature and no special semantics associated with singular forms
while plural forms are assumed to involve a semantically potent plural fea-
ture. The main conceptual advantage of such an approach is that it reconciles
semantic and formal markedness when it comes to number interpretation
and explains why in the languages under consideration there is a plural
morpheme but no special singular marking. The main empirical advantage of
our approach is that it predicts the possibility of using singular forms with
sum reference in case the semantic distinction between singular and plural
forms is neutralized, a possibility that is realized in Hungarian.
We have adopted the abstract system developed in Mattausch 2005, 2007
and adapted it to the morphology and semantics of number. Crucially, we
have suggested that the relevant semantic markedness parameter for the
languages under consideration is the distinction between the conceptually
unmarked atom reference and the conceptually marked inclusive or exclusive
sum reference.
The system we propose associates marked plural forms with marked sum
reference interpretation and unmarked singular forms with unmarked atomic
reference. The marked plural form is associated with the requirement that
sums be included among possible witnesses of the nominal, a requirement
that is realized by giving the feature [Pl] a polysemous semantics, with one
sense reserved for the exclusive interpretation and the other for the inclusive
interpretation. The unmarked singular form has no inherent semantics, but
under bidirectional optimization, it takes the complementary meaning of the
marked plural, which is exclusive atomic reference.
We have proposed a weak singular/strong plural approach in which formal
and interpretational markedness are parallel, a pattern we find elsewhere
in natural language. At the same time, our proposal meets the challenge
posed by the existence of plural forms interpreted inclusively. In fact, once
we adopt the view that having sum reference is the conceptually marked
interpretation, the account makes us expect plural forms to be used both for
inclusive and exclusive sum reference. What the system rules out, however,
is a plural form used when the existence of a sum witness is excluded. This
is a welcome result. The relevance of sum values to all uses of plurals in
the languages under consideration follows from our analysis without having
to assume a strong semantics for singulars (as in Sauerland et al. 2005) or
having to add a special modal presupposition for plurals (as in Spector 2007).
In our approach, just as in previous proposals, the competition between
the inclusive and the exclusive interpretation of plural forms is decided by
pragmatic rather than semantic factors. We have relied on applying the
Strongest Meaning Hypothesis to the interpretation of the plural, which
correctly predicts that plural forms will be interpreted exclusively in ordinary
upward entailing contexts and inclusively when under the scope of negation
or in the Restrictor of conditionals or distributive universals. But even in
contexts in which inclusive readings are permitted, sum reference must be
relevant. We have seen that subtle pragmatic factors determine in which
contexts and situations sum reference is relevant.
The main theoretical contribution of the account we developed here is
that it respects the Horn pattern while at the same time accounting for the
existence of inclusive plurals as well as for the main dividing line between
inclusive and exclusive plurals. We have shown here that such an account
is both possible and desirable. On the empirical side, our approach has
the advantage of accounting for the relevance of sum reference with plural
forms, as well as predicting the possibility of singular nominals with sum
reference just in case sum reference is imposed by D independently of what
is found in NumP. This is indeed the case of Hungarian singular DPs such as
sok gyerek ‘many child’. We have presented an account of these facts that
treats the singular form of these DPs as the result of the language valuing
functional economy over the pressure to mark sum reference uniformly with
the feature [Pl]. The obligatory plural form of such DPs in English is due to
this language valuing uniform [Pl] marking of sum denoting DPs higher than
functional economy. The account we propose then meets what we take to be
the main challenges number semantics faces without having to rely on any
tools that are not independently motivated.
References
Bale, Alan, Michaël Gagnon & Hrayr Khanjian. in press. On the relationship
between morphological and semantic markedness: the case of plural
morphology. Journal of Morphology http://linguistics.concordia.ca/bale/
pdfs/Morphology%20paper.pdf.
Beaver, David. 2004. The optimization of discourse anaphora. Linguistics and
Philosophy 27(1). 3–56. doi:10.1023/B:LING.0000010796.76522.7a.
Beaver, David & Hanjung Lee. 2004. Input-output mismatches in OT. In
Reinhard Blutner & Henk Zeevat (eds.), Optimality theory and pragmat-
ics, 112–153. Palgrave/MacMillan. https://webspace.utexas.edu/dib97/
publications.html.
Blutner, Reinhard. 1998. Lexical pragmatics. Journal of Semantics 15(2).
115–162. doi:10.1093/jos/15.2.115.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language in-
terpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Blutner, Reinhard. 2004. Pragmatics and the lexicon. In Laurence Horn &
Gregory Ward (eds.), Handbook of pragmatics, 488–514. Oxford: Blackwell.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena and the
syntax/pragmatics interface. In A. Belletti (ed.), Structures and beyond,
39–103. Oxford: Oxford University Press.
Chierchia, Gennaro. 2006. Broaden your views: implicatures of domain
widening and the "logicality" of natural language. Linguistic Inquiry 37(4).
535–590. doi:10.1162/ling.2006.37.4.535.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammati-
cal view of scalar implicatures and the relation between semantics and
pragmatics. In Klaus von Heusinger, Claudia Maienborn & Paul Portner
(eds.), Semantics. An international handbook of natural language meaning,
Mouton de Gruyter, New York, NY.
Cohen, Ariel. 2005. More than bare existence: An implicature of existential
bare plurals. Journal of Semantics 22(4). 389–400. doi:10.1093/jos/ffh031.
Corbett, Greville G. 2000. Number. Cambridge University Press, Cambridge.
doi:10.2277/0521640164.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stanley
Peters. 1998. Reciprocal expressions and the concept of reciprocity.
Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Farkas, Donka F. 2006. The unmarked determiner. In Svetlana Vogeleer &
Liliane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 81–106.
Singulars and Plurals
Blutner & Henk Zeevat (eds.), Pragmatics and optimality theory, 251–287.
Houndmills: Palgrave MacMillan.
Jakobson, Roman. 1939. Signe zéro. In Mélanges de linguistique offerts à
Charles Bally, Genève (also in Selected Writings II).
Kamp, Hans & Jan van Eijck. 1996. Representing discourse in context. In
Johan van Benthem & Alice ter Meulen (eds.), Handbook of logic and
linguistics, 179–237. Amsterdam: Elsevier.
Kamp, Hans & Uwe Reyle. 1993. From discourse to logic. Dordrecht: Kluwer
Academic Publishers.
Kester, Ellen-Petra & Christina Schmitt. 2007. Papiamentu and Brazilian
Portuguese: a comparative study of bare nominals. In Marlyse Babtista
& Jacqueline Guéron (eds.), Noun phrases in creole languages: a multi-
faceted approach, Amsterdam: Benjamins.
Kirby, Simon & Jim Hurford. 1997. The evolution of incremental learning:
language, development and critical periods. In Antonella Sorace, Caroline
Heycock & Richard Shillcock (eds.), Gala ’97 conference on language
acquisition, HCRC, Edinburgh University.
Kouider, Sid, Justin Halberda, Justin Wood & Susan Carey. 2006. Acquisition
of English number marking: the singular-plural distinction. Language
Learning and Development 2. 1–25. doi:10.1207/s15473341lld0201_1.
Krifka, Manfred. 1989. Nominal reference, temporal constitution and quantifi-
cation in event semantics. In Renate Bartsch, Johan van Benthem & Peter
van Emde Boas (eds.), Semantics and contextual expression, Dordrecht:
Foris publication.
Krifka, Manfred. 1995. Common nouns: a contrastive analysis of English and
Chinese. In Greg N. Carlson & Francis Jeffry Pelletier (eds.), The generic
book, 398–411. Chicago University Press. http://amor.rz.hu-berlin.de/
~h2816i3x/.
Kwon, Song-Nim & Anne Zribi-Hertz. 2006. Bare objects in Korean:
(pseudo)incorporation and (in)definiteness. In Svetlana Vogeleer & Lil-
iane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 107–132.
John Benjamins, Amsterdam.
Ladusaw, William. 1996. Negation and polarity items. In Shalom Lappin
(ed.), The handbook of contemporary semantic theory, 321–341. Oxford:
Blackwell.
Lakoff, Robin. 2000. The language war. University of California Press,
Berkeley.
Link, Godehard. 1983. The logical analysis of plural and mass nouns: a lattice-
Herburger & Paul Portner (eds.), Negation, tense and clausal architecture:
Cross-linguistic investigations, 199–218. Georgetown: Georgetown Univer-
sity Press. http://www.let.uu.nl/~Henriette.deSwart/personal/negot.pdf.
de Swart, Henriëtte. 2010. Expression and interpretation of negation: an OT
typology. Dordrecht: Springer (in press).
de Swart, Henriëtte & Joost Zwarts. 2008. Article use across languages: an
OT typology. In Atle Grønn (ed.), Sinn und Bedeutung, vol. 12, 628–644.
University of Oslo.
de Swart, Henriëtte & Joost Zwarts. 2009. Less form, more mean-
ing: why bare nominals are special. Lingua 119(2). 280–295.
doi:10.1016/j.lingua.2007.10.015.
de Swart, Henriëtte & Joost Zwarts. 2010. Optimization principles in the typol-
ogy of number and articles. In Bernd Heine & Heiko Narrog (eds.), Hand-
book of linguistic analysis, Oxford: Oxford University Press. http://www.
let.uu.nl/~Henriette.deSwart/personal/oupdeSwartZwartsmay08.pdf.
Winter, Yoad. 2001. Plural predication and the strongest meaning hypothesis.
Journal of Semantics 18(4). 333–365. doi:10.1093/jos/18.4.333.
Wood, Justin, Sid Kouider & Susan Carey. 2004. The emergence of sin-
gular/plural distinction. Poster presented at the biennial International
Conference on Infant Studies. http://www.wjh.harvard.edu/~lds/index.
html?carey.html.
Zwarts, Joost. 2004. Competition between word meanings: the polysemy of
around. In Sinn und Bedeutung, 349–360. Konstanz. http://www.let.uu.
nl/users/Joost.Zwarts/personal/.
Zweig, Eytan. 2008. Dependent plurals and plural meaning: NYU dissertation.
http://www-users.york.ac.uk/~ez506/.
Semantics & Pragmatics Volume 3, Article 2: 1–13, 2010
doi: 10.3765/sp.3.2
Uli Sauerland
Zentrum für Allgemeine
Sprachwissenschaft, Berlin
Both Chemla (2009b) (C in the following) and Geurts & Pouscoulous (2009a)
(G&P in the following) in recent papers in this journal provide welcome new
experimental evidence on embedded implicatures. However, while their work
takes us a couple of steps closer to full understanding of the issue, I will
argue that in both papers the theoretical implications of the new data are
overstated and much work remains to be done.
Intuitively clear cases of embedded implicatures are examples like (1).
(1) a. If you ate some of the cookies and no one else ate any, then there
must still be some left. (Levinson 2000: 205)
b. Mary solved the first problem or the second problem or both
problems. (Chierchia et al. 2008: (31))
∗ I thank Nicole Gotzner, Lisa Hartmann and the editors of this journal for their help with
this paper, and the German Research Foundation (DFG grant SA 925/1 in the Emmy Noether
Programm) for financial support.
Here the implicatures of some and or are part of the truth-conditional content
of an embedded sentence: the conditional in (1a), which is understood as
if you ate some and not all of the cookies, and a disjunct in (1b), which is
understood as Mary solved either the first problem or the second problem and
not both. In these examples, the sentence without the embedded implicature
would be either contradictory (If you ate some or all of the cookies, then there
must still be some left) or a violation of a pragmatic constraint (#Mary solved
at least one of the problems or both problems, see Singh 2008).
The question theorists of all stripes are faced with is whether and how to
integrate these phenomena into a general theory of implicatures, or at least
of quantity or scalar implicatures. Some narrower directions that have been
pursued to address the general question raised by embedded implicatures
are listed in (2).
G&P and C both focus on the first of these directions, and take this
discussion as far as it can be taken presently. However, I argue that just
pursuing the first direction is insufficient to resolve the issues embedded
implicatures raise fully. In particular, I show that independent pragmatic
constraints — in particular, the constraint of Truth Dominance (Meyer &
Sauerland 2009) — predict conditions on when embedded implicatures can
be detected that are largely independent of the account of embedded impli-
catures assumed. Therefore, the observations on the presence and absence
of (embedded) implicatures by G&P and C are consistent with a much wider
range of theories of implicatures than the original papers claim. In the
second section of the paper, I address a finding by C on the embedding of
free-choice effects that speaks to directions II and III of (2). I argue that this
finding is more significant for the account of embedded implicatures and
speculate on two theoretical ideas that would account for it. I conclude that
C’s second result is the most important one for the theory of implicatures
from these two papers.
Embedded Implicatures and Experimental Constraints
G&P are exclusively concerned with the first issue of (2). The primary target of
G&P is an extreme view of localism espoused by Levinson (2000) and Chierchia
(2004).1 The Levinson/Chierchia view predicts that implicatures should
always be fully local unless a cancellation mechanism applies. A number
of people have noted that this prediction seems to be intuitively wrong in
many cases. Specifically, this holds in case the embedded implicature is
not needed to make the sentence coherent (see for instance Geurts 2009;
Russell 2006; Sauerland 2004b). Compare (3) with (1): Intuitively, (3a) does
not seem to mean the same as If you ate some but not all of the cookies,
then you must have liked them. And for the multiple disjunction in (3b), the
paraphrase Mary solved either exactly one or all three of the problems, which
local computation of implicatures predicts, is clearly off the mark.
(3) a. If you ate some of the cookies, then you must have liked them.
b. Mary solved the first problem or the second problem or the third
problem.
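The prediction of strictly local computation for (3b) can be checked by brute force. The following Python sketch is an illustration only and not part of the original argument: it assumes that local strengthening turns each binary or into an exclusive disjunction, and the function names are hypothetical.

```python
from itertools import product

def exclusive_or(p, q):
    """Locally strengthened 'or': true when exactly one disjunct is true."""
    return p != q

def local_reading(first, second, third):
    """(first or second) or third, with each 'or' strengthened locally."""
    return exclusive_or(exclusive_or(first, second), third)

# Collect how many problems are solved in the situations that verify the
# locally strengthened reading.
true_counts = sorted(
    {sum(values) for values in product([False, True], repeat=3)
     if local_reading(*values)}
)
print(true_counts)
```

Under these assumptions the locally strengthened form holds just in case exactly one or all three disjuncts are true, which is the paraphrase rejected above.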
Levinson (2000) and Chierchia (2004) are falsified by the experimental data
to the extent possible.3
G&P claim their results also argue against another view, which they call
Minimal Conventionalism. However, I will show that G&P are mistaken:
Actually, their result says nothing about Minimal Conventionalism once we
take into account general pragmatic constraints on how ambiguous sentences
are judged. To show this, I consider the view of Fox (2007) as a concrete
example of G&P’s Minimal Conventionalism. I motivate the general pragmatic
principle of Truth Dominance and then argue that G&P’s data are fully
consistent with Fox’s (2007) account once Truth Dominance is taken into
account.
Fox’s (2007) account is non-committal on the locality of implicature com-
putation, allowing it to apply locally, but also globally. He assumes that
implicatures can be contributed to the meaning of a sentence by the grammatical
operator Exh.4 Fox (2007: 79, 97) defines the Exh operator via
the three statements in (4) through (6) (with minor notational adjustments).
The operator depends on a contextually provided set of alternative proposi-
tions C, which can be taken to be the scalar alternatives of the argument of
Exh in the examples in the following.
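Because the operator works on a proposition and a set of alternatives, its effect on small examples can be computed directly. The sketch below is a hedged Python illustration of exhaustification with innocent exclusion as it is commonly formulated following Fox (2007); it is not a transcription of (4) through (6). It assumes that propositions are modelled as sets of worlds, and all names (innocently_excludable, exh, the toy worlds) are hypothetical.

```python
from itertools import combinations, product

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def innocently_excludable(p, alternatives, worlds):
    """Alternatives contained in every maximal set whose members can all be
    negated consistently with p."""
    def consistent(excluded):
        # Some world makes p true and every excluded alternative false.
        return any(w in p and all(w not in q for q in excluded) for w in worlds)

    candidates = [s for s in subsets(alternatives) if consistent(s)]
    maximal = [s for s in candidates if not any(s < t for t in candidates)]
    return frozenset.intersection(*maximal) if maximal else frozenset()

def exh(p, alternatives, worlds):
    """p together with the negation of each innocently excludable alternative."""
    excluded = innocently_excludable(p, alternatives, worlds)
    return frozenset(w for w in worlds
                     if w in p and all(w not in q for q in excluded))

# Toy model (an assumption): a world fixes the truth of two sentences A and B.
worlds = list(product([False, True], repeat=2))
A = frozenset(w for w in worlds if w[0])
B = frozenset(w for w in worlds if w[1])
A_or_B = A | B
A_and_B = A & B

strengthened = exh(A_or_B, {A, B, A_and_B}, worlds)
print(sorted(strengthened))
```

In this toy model only the conjunction is innocently excludable, so exhaustifying the disjunction yields the worlds where exactly one disjunct holds. Neither disjunct is excluded on its own, because negating both would contradict the disjunction.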
Consider now Fox’s (2007) account for example (7). The account predicts
an ambiguity between a local+global reading, which corresponds to structure
(8a), and a global-only reading, which corresponds to structure (8b).
(7) All the squares are connected with some of the circles. (G&P: (26a))
3 Of course, there are always ways to save any scientific theory by adding additional assump-
tions, but nothing short of almost obligatory local cancellation of the proposed obligatory
local implicatures would seem to do the trick in this case.
4 There are two major differences between Fox’s account and that of Chierchia (2004): First,
Fox does not require local application of implicature computation. And second, his Exh
operator is different from Chierchia’s due to the appeal to innocent excludability. The second
difference does not matter for the following, but is important for the analysis of disjunctions
(Sauerland submitted).
(8) a. Exh All the squares λx Exh x be connected with some of the
circles.
b. Exh All the squares λx x be connected with some of the circles.
Meyer & Sauerland (2009) explain the lack of evidence for the every
only reading by arguing that this reading cannot be detected for pragmatic
reasons. Specifically, the Truth Dominance principle in (10) predicts it to be
undetectable: Because the strong, only every reading entails the weak,
every only reading, any situation where the truth values of the two readings
differ is one where the strong reading is false while the weak one is true. But
Truth Dominance predicts that in such a situation, speakers will judge the
sentence to be true, as it’s predicted to be by the weak reading. The strong
reading therefore remains undetectable in the truth conditions of (9).
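The undetectability argument can be verified on a toy model. The Python sketch below is an illustration under stated assumptions: it simplifies Truth Dominance to "a sentence is judged true whenever it is true on some reading" and ignores accessibility, and it uses a hypothetical encoding of two readings of (7) in which a situation records how many circles each square is connected with. The model size and all names are assumptions.

```python
from itertools import product

N_SQUARES, N_CIRCLES = 2, 3  # hypothetical size of the toy model

def weak_reading(situation):
    """Global-only: every square has some circle, not every square has all."""
    return (all(k >= 1 for k in situation)
            and not all(k == N_CIRCLES for k in situation))

def strong_reading(situation):
    """Local: every square is connected with some but not all circles."""
    return all(1 <= k < N_CIRCLES for k in situation)

def judged_true(situation, readings):
    """Simplified Truth Dominance: true if true on at least one reading."""
    return any(reading(situation) for reading in readings)

# A situation assigns each square the number of circles it is connected with.
situations = list(product(range(N_CIRCLES + 1), repeat=N_SQUARES))

strong_entails_weak = all(weak_reading(s) for s in situations if strong_reading(s))
judgments_match_weak = all(
    judged_true(s, [weak_reading, strong_reading]) == weak_reading(s)
    for s in situations
)
print(strong_entails_weak, judgments_match_weak)
```

Whenever one reading entails the other, the predicted judgments coincide with those of the weak reading alone, so truth-value judgments cannot reveal whether the strong reading is available.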
5 The principle can be traced back at least to work on wide scope indefinites by Abusch (1994).
Gualmini, Hulsey, Hacquard & Fox (2008) call a similar principle Charity. The differences
between Charity and Truth Dominance are not relevant to the discussion in this paper. In
fact, Charity would make exactly the same predictions as Truth Dominance for the examples
in the following.
The results of C (= Chemla 2009b) add one new aspect, but are otherwise
consistent with the picture already summarized: The results show that
obligatory localism is false, but don’t distinguish between other views. In
particular, C’s discussion of examples like (11a) and (11b) is limited in the same
way as the discussion of (9) by G&P: the only theory Chemla’s result argues
against is the extreme localism of Levinson (2000) and Chierchia (2004),
which is already known to have numerous problems. More viable views of
localism, where embedded implicatures are an option, but not required, make
exactly the right predictions for both examples in (11) — namely, the same
predictions as a global account.
The most interesting result of C’s study is the embedded free choice
effect in examples like (12). He shows experimentally that subjects judge (12)
to entail that every student is allowed to have an apple and also that every
student is allowed to have a banana.
representations in (13), while the latter globalist view permits only represen-
tation (13b)
Chemla focuses on the fact that Fox’s globalist view incorrectly predicts
that (12) should be restricted to scenarios where not all students make
the same choice since (13a) entails that neither every student is allowed to
have an apple nor every student is allowed to have a banana. What Chemla
fails to note, though, is that the optionally local version of Fox’s proposal also
predicts (13a) as a possible reading for (12). In particular, neither does (13a)
entail (13b), nor vice-versa, and therefore both readings should be detectable.
But this doesn’t seem to be the case and therefore (12) is also a problem
for the non-deterministic version of Fox’s proposal, not just for the global
one. Since I argued above that both versions are independently problematic,
Chemla’s new evidence just strengthens the point against both proposals.
The main conclusions I draw from Chemla’s paper concern a) the status of
free choice inferences, and b) the relation of embedded to global implicatures.
Chemla’s data only speak to my questions in (2) if we assume that free
choice inferences are indeed implicatures. Chemla’s data actually cast doubt on
this relationship. There is not that much empirical evidence in favor
of the relationship in the first place: the main direct piece of evidence in
favor of an implicature account of free choice inferences is the observation
by Kratzer & Shimoyama (2002) that the inferences disappear in the scope of
negation just like implicatures.8 However, C shows two differences between
free choice inferences and implicatures: First, only free choice inferences are
locally present in the scope of a universal quantifier as I already referenced
above. Second, C shows that negated modalized statements like (14a) don’t
trigger free choice inferences. Since (14b) is logically equivalent to (14a), the
absence of a free choice inference in (14a) shows that free-choice inferences
are detachable in the sense of Grice (1989). Usually implicatures are
non-detachable, as Grice himself already discusses.
8 Furthermore, free choice inferences can also be cancelled like other implicatures as in You
may have an apple or a banana, but I don’t know which.
C’s results are intuitively plausible and very interesting for the theory of
free-choice inferences. As far as I can see, there are two possible directions to
pursue. On the one hand, one could seek to treat free choice inferences not as
implicatures. Specifically, C’s result could be seen to support non-implicature
accounts of free choice such as Zimmermann (2000). On the other hand, it
may be that matrix free-choice effects are still implicatures, but embedded
free choice effects may be due to a special free-choice inference generating
operator. The latter position should be attractive to those who believe that
there is a satisfying analysis of free-choice inferences as implicatures (Fox
2007; Schulz 2005).
3 Conclusions
In sum, the recent experimental work by G&P and C has confirmed the
views of those who have argued against the obligatory localism of Levinson
(2000) and Chierchia (2004), e.g. Geurts (2009), Russell (2006), and Sauerland
(2004b). Beyond that, the account of embedded implicatures and their
relation to global implicatures are still unclear. Solely testing for the presence
of embedded implicatures as G&P and C mostly do may be insufficient for
understanding embedded implicatures. Rather it may be more promising to
investigate whether the content of embedded implicatures is exactly the same
as that of implicatures at the matrix level.
In this direction, the difference C observes between embedded and matrix
implicatures is interesting and most likely helpful in sorting out the puzzle
of embedded implicatures. While I have no complete account to offer myself,
I close with some arguments to be skeptical of Geurts & Pouscoulous’s (2009b)
analysis of Chemla’s example (12): Geurts & Pouscoulous (2009b) suggest
accounting for (12) as an instance of an embedded speech act. This account, I
argue now, is plausible for some cases, but most likely cannot cover all cases
of embedded implicatures: The possibility of embedded speech acts has
been acknowledged at least since Huddleston (1973) and embedded speech
acts can certainly be a source of embedded implicatures:9 (15) illustrates
that embedded speech acts must trigger embedded implicatures: the modal
particle wohl (‘well’) requires an embedded speech act interpretation for the
complement of glaubt (‘believes’) and furthermore triggers an inference that
9 The idea of a metalinguistic negation of Horn (1985) is closely related to the idea of embedded
implicatures, but more specific since it assumes a restriction to negation.
(15) #Bill glaubt, dass einige der Kinder wohl krank sind. Aber alle
Bill believes that some of the children wohl sick are but all
Kinder sind krank.
children are sick.
However, this alone doesn’t predict correctly that (15) is odd. The oddness
of (15) is only predicted if there is also an embedded implicature. The
embedded implicature is the reason that the stronger belief, that some, but
not all children are sick, is attributed to the speaker. Then (15) is predicted
to be odd because the second sentence explicitly contradicts this attribution
of the embedded implicature to the speaker.
This example indicates that embedded speech acts trigger embedded
implicatures as all theories of speech acts would predict.10 However, I do
not believe that the reverse entailment also holds — that an embedded im-
plicature is always triggered by an embedded speech act. One problem for
this entailment is the following: Krifka (2001) argues that most does not
allow embedding of speech acts in its scope. Hence, (16) should not allow
embedded free choice inferences unlike (12). However, this doesn’t accord
with my intuitions: (16) suggests that most students can choose freely. For
instance, consider (16) in the following scenario: the majority of students
can freely choose between A and B and the few other students, who cannot
freely choose, must do option A. In this scenario (16) seems acceptable to
me, even though it may happen that not a single student chooses option B.
The acceptability of (16) in such a scenario is only expected if the free choice
inference is embedded in the scope of most.
data in (12) are still in need of an account. And the search for such an
account may finally really lead us to a better understanding of embedded
implicatures.
References
245–260. doi:10.1007/s10988-008-9038-x.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epis-
temic possibility. Natural Language Semantics 8(4). 255–290.
doi:10.1023/A:1011255819284.
Uli Sauerland
Zentrum für Allgemeine Sprachwissenschaft
Schützenstr. 18
D-10117 Berlin
uli@alum.mit.edu
Semantics & Pragmatics Volume 3, Article 8: 1–57, 2010
doi: 10.3765/sp.3.8
Eric McCready
Aoyama Gakuin University
Abstract
This paper provides a system capable of analyzing the combinatorics of a
wide range of conventionally implicated and expressive constructions in nat-
ural language via an extension of Potts’s (2005) LCI logic for supplementary
conventional implicatures. In particular, the system is capable of analyzing
objects of mixed conventionally implicated/expressive and at-issue type,
and objects with conventionally implicated or expressive meanings which
provide the main content of their utterances. The logic is applied to a range
of constructions and lexical items in several languages.
1 Introduction
The nature of conventional implicatures has been under debate since their
existence was proposed by Grice (1975). Some philosophers deny that there
are such things at all (Bach 1999). In linguistic semantics, however, there
has been a recent surge of interest in their analysis, starting with the work
of Potts (2005). The work of Potts in this area has centered on conventional
implicatures that provide content which supplements the main, at-issue
content of the sentence in which they are used.
∗ Thanks to Daniel Gutzmann, Yurie Hara, Makoto Kanazawa, Stefan Kaufmann, Chris Potts,
Magdalena Schwager, Yasutada Sudo, Wataru Uegaki, Ede Zimmermann, and audiences at
NII, Kyoto University and the University of Göttingen for helpful discussion, and in particular
to three anonymous reviewers for Semantics and Pragmatics, as well as David Beaver and
Kai von Fintel, for extremely useful and insightful comments.
Here, the content of the nominal appositive in (1a) and that of the speaker-
oriented adverbial in (1b) add content to the utterance, but in a way intuitively
independent of the claim the speaker intends to make by her utterance. No-
tice also that the appositive and adverbial only introduce conventionally
implicated content; they add nothing to the ‘at-issue’ content. This is charac-
teristic of all the elements studied by Potts.1
A number of authors (e.g. Bach 2006; Williamson 2009) have noted that
not all lexical items (or constructions) are associated exclusively with at-
issue content, or with conventionally implicated or expressive (CIE) content;2
instead, some expressions seem to introduce both. Pejoratives are the most
widely cited example. Williamson discusses an example from Dummett (1973),
the (extinct) pejorative Boche, which according to Williamson was in use in
Britain and France in the initial stages of WW1 in anti-German propaganda.
This choice is presumably made to avoid other expressions that are more
obviously offensive to the modern reader. However, the obsolete nature of
Boche makes it difficult to have clear intuitions about sentences in which it is
used. I will therefore make use of the pejorative Kraut instead, as an example
of a pejorative that, while still attested, is probably milder and less offensive
than some other possible choices.3 In any case, all instances of pejoratives in
this paper are data; they are mentioned, not used.
(2) He is a Kraut.
Pejoratives plainly introduce what I will call mixed content: they are pred-
icative of at-issue content, yet introduce a conventional implicature. I will
1 It is still possible that these expressions could be presuppositional in nature, rather than
part of a separate class of conventionally implicated meanings, as suggested by a reviewer.
I find the arguments of Potts on this issue (2005, 2007a, and 2007b) convincing, but I will
return to the point below.
2 ‘CIE content’ is intended as a neutral term for conventionally implicated and expressive con-
tent. In this paper, the assumption is made that, to a first approximation, both conventional
implicatures and expressives make use of roughly the same combinatoric system. Where the
distinction matters, I will not use the cover term.
3 I thank David Beaver and Kai von Fintel for helping me with the difficult choice of which
pejorative would have the desired qualities of being both relatively common and relatively
inoffensive. Hom (2008), faced with a similar decision, makes use of Chink, which is perhaps
fairly similar in quality.
Varieties of conventional implicature
2 Mixed Content
2.2 Pejoratives
saying (2), repeated as (3), I assert that the referent of he is German, and
express that I have negative feelings about him.
(3) He is a Kraut.
Here, ‘Kraut’ obviously must contribute to at-issue content: if it does not, the
sentence cannot form a proposition, for the pejorative is the main predicate
of the sentence. The same can be seen when pejoratives serve as subjects.
Here, the pejorative term is serving as the first argument to the determiner
(on a standard semantics). Pejoratives thus clearly form part of at-issue
content.
The expression of negative feeling that the word introduces, though, is not
part of at-issue content. This can be seen by considering the characteristics
of conventionally implicated and expressive content as discussed in Potts
2005 and Potts 2007a. Potts lists a number of properties that these kinds
of content are meant to have, some of which have been called into question
by various authors (e.g. Wang, Reese & McCready 2005; Wang, McCready &
Reese 2006; Geurts 2007; Amaral, Roberts & Smith 2008). In this paper I
will primarily consider two tests for conventional implicature/expressiveness
(CIEness). The first is scopelessness. The second is the behavior of CIE items
under denial.
CIE items, by definition, do not participate in at-issue semantic processes.7
In particular, they are not affected by semantic operators. Consider the
following examples.
In these examples, it is clear that the content of the nominal appositives is not
affected by the negation or by the conditional, and similarly for the expressive
adjective damn. In this respect, CIE content is similar to presupposition.
It differs in that it cannot be bound (cf. van der Sandt 1992). ‘Binding’
refers to the situation in which a conditional antecedent (or other universal
construction) entails the content of a presupposition which appears in the
consequent. In this situation, no presupposition is projected.
Consider what happens when one denies a sentence containing CIE content.
As the following examples show, the CIE content cannot be the target of
denial.
any way at all.10 Instead, what is expressed by the sentences in (12) is that
the speaker takes German people to be bad.11 Presumably the sense that the
subject individual is negatively characterized that Williamson picks up on is
derived via an inference: since it is asserted that he is German, and expressed
that German people are bad, it is also expressed, though indirectly, that he is
bad. But this does not seem to be a part of literal content, either at-issue or
CIE.
Supposing then that the expressed content of Kraut is roughly that
German people are bad, we can test its bindability via a conditional in the
usual way.
This sentence is rather odd, in part because the expressed content of Kraut
does indeed appear to project from the conditional.12 On the assumption that
the proposed paraphrase is the right one, and generalizing from this case,
we can conclude that the expressed content of pejoratives is CIE rather than
presupposed. I will assume so in the following. It should be noted, however,
that the significance of the result of the binding test depends on the accuracy
of the paraphrase. If the paraphrase given is incorrect, or, even worse, if the
expressive portion of pejoratives is such that it does not admit a linguistic
paraphrase at all, then the test is invalidated. This is worrisome given the
analysis of Potts (2007a), according to which expressives have the property
of ‘ineffability,’ meaning that they literally cannot be paraphrased in ways
not involving other expressives.13 Even in this case, though, an expressive
paraphrase is possible:14
10 Unless one takes it to be a bad thing that one is not, or might be (etc.), a German; I will ignore
this notion in the following.
11 This may well be what Dummett had in mind.
12 A reviewer suggests that the oddity is due to the speaker apparently expressing uncertainty
about his own attitudes, which should be pragmatically inappropriate. However, even if the
speaker is an amnesiac who in fact does not know what his attitudes are (in some sense), the
oddity remains, suggesting that this is not the right explanation.
13 Geurts (2007) notes that something similar holds for other, non-expressive words like
green, though: they are not easily given satisfying paraphrases either. See also Fodor 2002.
However, the degree of difficulty seems to be different for the cases of green and (e.g.) damn.
A paraphrase of the latter cannot even be attempted without using expressives, whereas one
can (for instance) try to give exemplars of greenness for the former. I think Potts is right in
distinguishing the two types. I will have more to say about this issue in the conclusion.
14 A reviewer notes that the projection behavior may not be very surprising, given that we also
have expressive content in the antecedent, which has nothing to bind it. The fact that it
Here, if one accepts the Richard analysis, the expressive content of ‘Kraut’ is
pretty clearly entailed15 by the content of ‘(I) hate the damn/fucking Germans.’
The conclusion is that this part of the content of Kraut is not presupposed,
which indicates that it is highly likely to be CIE content, given its other
behavior.16
Let us now consider the second test. What happens when one tries to
deny the content of a pejorative?
The result of this test also supports the conclusion that the negative part of
the meaning of Kraut, and, by extension, pejoratives in general is CIE content,
and not part of the at-issue meaning.
To sum up, we have reached the conclusion that pejoratives play a dual
semantic role: they act as ordinary nominals for predication or as arguments
of determiners, etc., but carry CIE content as well. They also appear to be
monomorphemic, at least in many cases. One might argue (as has Chris Potts,
p.c.) that in fact pejoratives are polymorphemic. An argument for such a
view comes from pejoratives like Jap, which could be viewed as composed of
is necessary to use expressives to paraphrase other expressives (given Potts’s ineffability
condition) may be one reason that binding of CIE content is impossible.
15 Or some expressive equivalent for the Potts 2007 system. Since according to that analysis
the function of (emotive) expressives is to narrow down a subinterval of R used as a model
of a range of emotion displayed with respect to some object, one can define a notion of
emotive entailment according to which P(x) emotively entails Q(x) iff the interval assigned to
x by P is a subset of that assigned to x by Q. Since I will not make use of this system in this
paper, I will not work out the details.
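The notion of emotive entailment just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that an expressive predicate maps an individual to a closed interval of real numbers written as a (low, high) pair; the function names and the sample intervals are hypothetical and are not taken from Potts's system.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]          # closed interval (low, high)
Expressive = Callable[[str], Interval]  # hypothetical: individual -> interval


def is_subinterval(inner: Interval, outer: Interval) -> bool:
    """True iff `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def emotively_entails(p: Expressive, q: Expressive, x: str) -> bool:
    """P(x) emotively entails Q(x) iff P's interval for x is a subset of Q's."""
    return is_subinterval(p(x), q(x))


# Hypothetical expressives: the first narrows the range more than the second.
def strongly_negative(x: str) -> Interval:
    return (-1.0, -0.5)


def negative(x: str) -> Interval:
    return (-1.0, 0.0)
```

On these toy values the narrower expressive emotively entails the broader one, but not the other way around.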
16 A reviewer suggests an analysis in terms of indexical presuppositions (Schlenker 2007), with
the following lexical entry:
(i) $[\![\text{Kraut}]\!]^{c,w} = \lambda x$ : speaker(c) has a negative attitude toward German people in w(c). German(x)
But this suggests (as far as I can see) that such a presupposition should be bindable in
examples like (ii).
Again, in amnesia contexts, this should be felicitous; and here the content certainly projects.
2.3 LCI
Potts (2005) proposes a pair of logics called LCI and LU for the analysis of
conventional implicature.19 These two logics interact in sometimes complex
ways. The parts of the system that concern us here involve a) what kinds
of expressions are semantically well-formed, b) how these expressions are
combined in the logical syntax, and c) how the resulting expressions are
interpreted. These issues all relate to LCI, which is a higher-order lambda
calculus. The first corresponds to a definition of admissible types in LCI
and the second to rules for how the admissible types are combined. The
third issue corresponds to a rule for the interpretation of conventionally
implicated expressions: effectively a mapping from expressions of LCI,
the type theory used for the combinatorics, to logical forms intended for
model-theoretic evaluation. I examine each in turn. As we will see, the system
as set up in Potts’s work cannot be used to model the behavior of mixed
content expressions, which will prompt modifications to it in section 2.4.
First, the types themselves. Potts defines a system of types. Here, as in
17 As in ‘Yankee Go Home’ — I make no claims about the historical development of the term.
18 It is still debatable whether the precise content I have proposed (following Richard) is right.
Hom (2008) gives an interesting analysis in which pejorative content is not expressive at
all, but instead is a social construct varying across speaker groups. I will not argue in
detail against this proposal here — I am sympathetic to the notion of social construction
of meaning, at least in these sorts of cases — but I doubt that all the content of pejoratives
is truth-conditional. Hom considers and rejects the sort of evidence (denials and operator
scope arguments) I have made use of here. In my opinion he is too hasty in doing so, but
fully responding to his arguments would take us too far afield.
19 I will not review the full motivations for these logics here, or all the details of how they work.
I will focus only on the parts that will be necessary for the proposal in this paper.
the type theories standardly used in linguistic semantics (cf. Heim & Kratzer
1998), basic types are e, t, s, which are used to produce an infinite set of
types via the usual kind of recursive definition. (The details of the definition
are provided in Appendix A.) However, Potts’s logic differs in that it makes
crucial use of a distinction between at-issue types and CI types (‘CI’ indicating
conventional implicature). The distinction is indicated via a superscript ‘a’
or ‘c’ on the type name. At-issue types are freely produced in the usual way.
CI types are distinct: they are always of the form $\langle \sigma^a, \tau^c \rangle$, functions taking
at-issue typed objects as input and outputting CI-typed objects. There is no
mechanism for producing types that take CI-typed objects as input. This,
according to Potts, is the reason that conventionally implicated content is
independent of at-issue operators: there simply are no operators over CI
content.
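The restriction on type formation can be made concrete with a small sketch. It is an illustration under stated assumptions, not the definition in Appendix A: the class names `Basic` and `Func` and the helpers `dimension` and `well_formed` are hypothetical, and a functional type counts as CI exactly when its final result is CI.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    name: str  # 'e', 't' or 's'
    dim: str   # 'a' (at-issue) or 'c' (CI)


@dataclass(frozen=True)
class Func:
    arg: "Type"
    res: "Type"


Type = Union[Basic, Func]


def dimension(ty: Type) -> str:
    """The dimension of a type is that of its final result."""
    return ty.dim if isinstance(ty, Basic) else dimension(ty.res)


def well_formed(ty: Type) -> bool:
    """Arguments must always be at-issue: nothing takes CI content as input."""
    if isinstance(ty, Basic):
        return ty.name in {"e", "t", "s"} and ty.dim in {"a", "c"}
    return (well_formed(ty.arg) and well_formed(ty.res)
            and dimension(ty.arg) == "a")


e_a, t_a, t_c = Basic("e", "a"), Basic("t", "a"), Basic("t", "c")
```

With these definitions `Func(e_a, t_c)` is admissible, whereas `Func(t_c, t_a)`, a would-be operator over CI content, is rejected.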
How are these objects combined? LCI has the derivation rules for type
combination shown in Figure 1. Potts couches them as ‘tree admissibility
conditions’, but these amount to more or less the same thing as derivation
rules if one understands his trees as proof trees. The Figure 1 notation is more
compact, so I will use it in what follows; as far as I am concerned it is a
notational variant. It should be noted, however, that the logic behaves in ways
that are odd from the standpoint of many logics familiar to linguists, such
as categorial grammar. Notably, unlike the categorial grammars implemented
for standard at-issue semantic combination, it is not resource sensitive for
CI types, as detailed below. A resource-sensitive
logic is one that consumes resources as they are used in proofs. This is a
property of the combinatorics of at-issue content: combining sleeps with
John yields sleeps(john), but the meanings of noun and verb are consumed
and no longer available for further composition. As we will see, this is a
property that LCI rightly lacks in its treatment of CI types.
The rules in Figure 1 are meant to model the combinatorics in conjunction
with a syntactic structure, just as in the work of Potts, meaning that they
should retain the constituency-driven character of the original LCI rules.20
(R1) is just a reflexivity axiom. (R2) is ordinary application for at-issue
20 I also diverge from Potts in my treatment of CI propositions introduced low in a tree.
In Potts’s formulation, the possible presence of such additional CI conditions warrants
sometimes thinking of these rules as shorthand for a larger rule set. See Potts 2005: 222 for
details. Instead of taking this route, I will consistently make use of R5 to eliminate all elements of
type $t^c$ from derivations immediately after they are derived, which means that there will not
be extra free-floating CI content. Thanks to a reviewer for inspiring this strategy.
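\[
\frac{\alpha : \sigma}{\alpha : \sigma} \quad \text{(R1)}
\]
\[
\frac{\alpha : \langle \sigma^a, \tau^a \rangle, \quad \beta : \sigma^a}{\alpha(\beta) : \tau^a} \quad \text{(R2)}
\]
\[
\frac{\alpha : \langle \sigma^a, \tau^a \rangle, \quad \beta : \langle \sigma^a, \tau^a \rangle}{\lambda X.\, \alpha(X) \wedge \beta(X) : \langle \sigma^a, \tau^a \rangle} \quad \text{(R3)}
\]
\[
\frac{\alpha : \langle \sigma^a, \tau^c \rangle, \quad \beta : \sigma^a}{\beta : \sigma^a \bullet \alpha(\beta) : \tau^c} \quad \text{(R4)}
\]
\[
\frac{\beta : \tau^a \bullet \alpha : t^c}{\beta : \tau^a} \quad \text{(R5)}
\]
\[
\frac{\alpha : \sigma}{\beta(\alpha) : \tau} \quad \text{(R6)} \quad \text{(where } \beta \text{ is a designated feature term)}
\]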
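The contrast between at-issue application (R2) and CI application (R4), together with the elimination step (R5), can be sketched as follows. This is an illustrative encoding rather than Potts's own formulation: terms are kept as strings, types are either strings such as "e^a" or (argument, result) pairs, and the lexical items `bad`, `sleeps` and `john` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    expr: str   # the lambda term, kept as a string
    ty: object  # "e^a", "t^a", "t^c", ... or a pair (argument type, result type)


def final(ty):
    """Final result type of a (possibly functional) type."""
    return ty if isinstance(ty, str) else final(ty[1])


def r2(f: Term, x: Term) -> Term:
    """(R2): at-issue application; functor and argument are both consumed."""
    arg, res = f.ty
    if arg != x.ty or not final(res).endswith("^a"):
        raise TypeError("(R2) does not apply")
    return Term(f"{f.expr}({x.expr})", res)


def r4(f: Term, x: Term):
    """(R4): CI application; the argument is passed on next to the CI result."""
    arg, res = f.ty
    if arg != x.ty or not final(res).endswith("^c"):
        raise TypeError("(R4) does not apply")
    return (x, Term(f"{f.expr}({x.expr})", res))  # x • f(x)


def r5(bullet_pair) -> Term:
    """(R5): drop a finished CI proposition, keeping the at-issue term."""
    at_issue, ci = bullet_pair
    if ci.ty != "t^c":
        raise TypeError("(R5) only removes terms of type t^c")
    return at_issue


john = Term("john", "e^a")
bad = Term("bad", ("e^a", "t^c"))        # hypothetical CI functor
sleeps = Term("sleeps", ("e^a", "t^a"))  # ordinary at-issue predicate

bullet = r4(bad, john)         # john : e^a • bad(john) : t^c
reusable = r5(bullet)          # john : e^a is still available
result = r2(sleeps, reusable)  # sleeps(john) : t^a
```

The point of the sketch is that `john` survives CI application and can still feed (R2), which is the sense in which the CI side is not resource sensitive.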
(16) Proof tree interpretation (after Potts). Let T be a proof tree with at-issue
term $\alpha : \sigma^a$ on its root node, and distinct terms $\beta_1 : t^c, \ldots, \beta_n : t^c$
on nodes in it. Then the interpretation of T is
$\langle \alpha : \sigma^a, \{\beta_1 : t^c, \ldots, \beta_n : t^c\} \rangle$.
Here α and β are variables over lambda terms, and $\sigma^a$ is a variable over
semantic types. The superscripts distinguish the types as either at-issue
(superscript a) or CI (superscript c). Effectively, conventionally implicated
content is shunted into a separate dimension of meaning. The bullet therefore
functions as a bookkeeping device in the proof.
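A minimal sketch of the interpretation schema in (16) follows. It assumes a simplified encoding in which each node of a proof tree carries (expression, type) pairs and types are plain strings ending in "^a" or "^c"; the class `Node` and the sample tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    terms: List[Tuple[str, str]]  # (expression, type) pairs on this node
    children: List["Node"] = field(default_factory=list)


def ci_propositions(node: Node) -> Set[str]:
    """Collect every term of type t^c found anywhere in the tree."""
    found = {expr for expr, ty in node.terms if ty == "t^c"}
    for child in node.children:
        found |= ci_propositions(child)
    return found


def interpret(root: Node):
    """Pair the root's at-issue term with the set of CI propositions."""
    at_issue = [(expr, ty) for expr, ty in root.terms if ty.endswith("^a")]
    if len(at_issue) != 1:
        raise ValueError("the root must carry exactly one at-issue term")
    return at_issue[0], ci_propositions(root)


# Hypothetical tree: a CI functor applies low in the tree, then the predicate.
tree = Node(
    [("sleeps(john)", "t^a")],
    [
        Node([("sleeps", "<e,t>^a")]),
        Node(
            [("john", "e^a"), ("bad(john)", "t^c")],
            [Node([("bad", "<e^a,t^c>")]), Node([("john", "e^a")])],
        ),
    ],
)
meaning = interpret(tree)
```

The second component of `meaning` is the separate dimension into which the CI propositions are shunted.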
The action of these three elements of the Potts logic, then, is as follows.
First, types for conventional implicature are defined; crucially, there are no
types that take conventionally implicated content as input. Second, these
types are combined via the rules in (R1-6). With respect to conventional
implicatures, this means the effect is to isolate conventionally implicated
content from at-issue content with a bullet, by rules (R4) and (R5). •-terms
are then split into separate dimensions of meaning by the schema in
(16).
Let us consider how this logic can be used for the analysis of mixed
content objects. It is easy to see that it cannot be so used in its current
form, given the assumption that the at-issue and CIE content are introduced
by the lexical item simultaneously. The type construction rules (again, see
Appendix A for details) provide for types of the form $\langle \sigma, \tau \rangle^a$, purely at-issue
types, and $\langle \sigma, \tau \rangle^c$, purely CI types. Intuitively, in the case of pejoratives
we require an object with the type of an ordinary predicate in the at-issue
dimension, and one of propositional type which is CIE.23 What we need is
a typing for objects that are of mixed type, but this cannot be produced in
LCI . As far as I can see, the only way to model mixed content in LCI would
be to assume that content can be introduced in two distinct stages. This
on semantic derivation trees and semantic derivation proofs should yield the same results,
given that the mechanisms of derivation are equivalent. I do not see a substantial difference
in giving derivation trees citizen status and giving the same kind of status to proof trees. In
any case, the proof-based rule is less odd in the context of derivations proceeding in concert
with a syntax, and problems that could arise with e.g. λ-abstraction will not arise in the
context of CIE content, where (as far as is known presently) abstraction does not occur. Still,
if the reader feels happier with using trees, she is welcome to perform the translation, which
is technically trivial.
23 If one follows e.g. Williamson and takes pejoratives to introduce predicates in the CI
dimension as well, the situation changes somewhat, but the basic problem is the same. We
will see cases of this type in section 2.6.
This is the desired logical form. But this kind of approach requires
allowing mixed content objects to separately introduce multiple pieces of
content. This analysis seems to destroy the intuition that pejoratives and
other instances of mixed content are singular semantic objects with a dual
character. It indeed strikes me as highly unnatural to have a lexical entry
realized in terms of multiple, fully separate entities.24,25 I therefore take it to
be truer to intuitions to modify the logic in such a way that mixed content
can be modelled directly. This is done in the following section.
2.4 L+CI
This section of the paper proposes L+CI, an extension of LCI that can handle
mixed content. In the process, we will also define a sublogic of L+CI, $L^{+S}_{CI}$,
which introduces a set of types for CIE objects that have resource-sensitive
properties.
The first necessary step involves adding resource-sensitive CIE types to
LCI . The reason is that there are mixed content items which are predicative
in both dimensions. Pejoratives introduce mixed content: but only part of
this content, the at-issue portion, is predicative (or so I have argued). The
CIE content is propositional. Because it is propositional, there is no special
24 The case of presupposition may seem formally similar on a superficial level, but it is rather
different in that presuppositions (on some perspectives at least) simply indicate definedness
conditions for the at-issue content, whereas here the two bits of content are entirely separate
and represent fully distinct discourse contributions.
25 Note also that the proposed analysis is different from analyzing single lexical items as
consisting of a single complex condition; the two types of decomposition are entirely
different in quality. Assigning a word a meaning of the form λx[P (x) ∧ Q(x)] seems rather
different from giving it a pair of meanings λx[P (x)] and λx[Q(x)] which are meant to
apply to the input at different points in the derivation. The latter seems appropriate in only
special situations, e.g. when a word makes two distinct contributions that can be traced
back to specific distinct parts of the word. We will return to such examples in section 2.6,
where I will discuss the general merits of the decompositional strategy.
need for resource-sensitive types here; but in cases where there is a dual
predication, a lack of resource sensitivity will cause serious problems in the
meaning composition, as I will detail shortly. It is not hard to find cases
of mixed content where both the at-issue content and the CIE content are
predicative. An instance can be found in the Japanese honorific system.
Certain honorifics in Japanese come with special morphology which clearly
carries the honorific load; these sorts of expressions are analyzed by Potts &
Kawahara (2004) as introducing a kind of expressive content. In such cases,
it is easily possible to analyze the morphemes as introducing supplementary
expressive content exclusively. However, there are other lexical items which
simultaneously honor some individual and predicate something of her. An
example is irassharu ‘come[Hon]’.
Here, the verb simultaneously says of the teacher that she came, and indicates
that she is deserving of honor.26 This verb satisfies both criteria for
mixed content: it both introduces an at-issue predication and expresses
honorification at the CIE level.27 Further, the verb is (at the surface at least)
monomorphemic. It cannot be separated into morphemes introducing at-issue
and expressive content separately, unlike (for instance) the honorifics
studied by Potts & Kawahara (2004), which clearly contain morphemes which
separately provide honorific meanings. This does not of course preclude a
decompositional analysis, on which more below. But, barring independent
(synchronic) reasons for such an analysis, it seems desirable to analyze this
expression as simultaneously introducing two types of meaning, and so as a
bearer of mixed content.
The upshot is that honorifics like irassharu are instances of mixed content
which are predicative in both dimensions of meaning. How could such examples
be analyzed in LCI? Note what will happen if we make the obvious move,
and analyze this expression as involving an object of at-issue predicative
type, and a CIE object of similar type, conjoined by a bullet as usual:
26 Or however one wishes to paraphrase the honorific relation; I will not address this question
here in detail. See section 2.4 for some brief discussion.
27 For arguments that honorific content is expressive, see Potts 2005, Potts & Kawahara 2004,
and Kim & Sells 2007.
(18) irassharu = $\lambda x.\, \mathrm{come}(x) : \langle e, t \rangle^a \bullet \lambda x.\, \mathrm{honor}(s, x) : \langle e, t \rangle^c$
Applying this object to the referent of sensei ‘the teacher’ (which I will treat
as a referring expression for simplicity) yields the following by R4, or would
if R4 were defined for expressions conjoined by the bullet operator, which it
actually is not. To extend R4 to cases of •-conjoined objects, we
would need to define a new rule. Let us see what such a rule would
be for purposes of discussion. This rule simply assumes that we perform
pointwise application of every element conjoined by a bullet according to the
proper rules, which will be R2 for the at-issue side of the bullet and R4 for
the CIE side. The use of R4 of course means that the content of the input to
the CI type will be duplicated in the output, yielding the results of the two
applications, and an unmodified input as well.
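\[
\frac{\alpha : \langle \sigma^a, \rho^a \rangle \bullet \beta : \langle \sigma^a, \tau^c \rangle \qquad \gamma : \sigma^a}{\alpha(\gamma) : \rho^a \bullet \beta(\gamma) : \tau^c \bullet \gamma : \sigma^a} \quad (19)
\]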
With this rule we can attempt a derivation of (17), which will go as follows.
\[
\frac{\alpha : \langle \sigma^a, \tau^s \rangle, \quad \beta : \sigma^a}{\alpha(\beta) : \tau^s} \quad \text{(R7)}
\]
We can then modify the rule in (16) to handle information from shunting
types as well. $\sigma^{\{x,y\}}$ indicates that $\sigma$ is a type of sort x or sort y.28 We will
see a number of examples of the application of this rule in what follows.
The combination of (R7) and the new interpretation rule in (20) serves to main-
tain the original generalizations about supplementary meanings provided by
LCI while expanding the system’s coverage to conventional implicatures that
introduce the primary meaning of the sentence they appear in. In section 3, I
will show that the possibilities made available by the existence of these types
are exploited by natural language, even outside the domain of mixed content.
The resources to create the needed kind of objects to model mixed content
are obviously already present in $L^{+S}_{CI}$. We already have what we need: at-issue
types and CI types. We need only a way to produce product types across the
two dimensions, and then an application rule telling us what to do with such
types when we have them. I will now provide these tools; the resulting type
system is called L+CI .
It is rather simple to add the relevant types. We need only a single
typing rule producing mixed types. This rule is provided in Appendix B.2. It
produces types of the following form:
This object is a product type where the conjoined types are an at-issue
type and a shunting type.29 Note that the input to the at-issue type and
the shunting type need not be of the same semantic type; this means that
it is in principle possible that the situation arises where the two will have
incompatible inputs. Such typings will not work in composition though, as
28 I thank Yasutada Sudo for helping me to correct an infelicity in an earlier version of this
definition.
29 These objects are rather similar to the dot objects of Pustejovsky (1995), as already mentioned
in footnote 21. The difference is that, in Generative Lexicon theory, trying to make use of
both ‘sides’ of the dot object generally results in zeugmatic infelicity as in (i), so there is no
rule like (R8) even in the extended system (Asher & Pustejovsky 2005).
they will not be interpreted by any rule, which will rule them out in practice.
Mixed types like these are paired with λ-terms of the form $\alpha \blacklozenge \beta$: ‘$\blacklozenge$’ (hereafter
‘diamond’) signifies a semantic object of mixed type. We now need rules for
interpreting these types. I propose the following two.
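\[
\frac{\alpha \blacklozenge \beta : \langle \sigma^a, \tau^a \rangle \times \langle \sigma^a, \upsilon^s \rangle, \quad \gamma : \sigma^a}{\alpha(\gamma) \blacklozenge \beta(\gamma) : \tau^a \times \upsilon^s} \quad \text{(R8)}
\]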
Given as input a mixed type and an object of the at-issue type that is input
to both conjoined elements in the mixed type, (R8) outputs the result of
applying each element of the mixed type to the input, where both objects are
conjoined with ‘$\blacklozenge$’ as before. An example of this is precisely the derivation of
mixed content terms, where both CIE content and at-issue content look for
objects of the same type as input; we will see many examples in the coming
sections. We will need one further rule telling us what to do with mixed terms
when the CIE part of the derivation is complete: this is provided as R9.
\[
\frac{\alpha \blacklozenge \beta : \sigma^a \times t^s}{\alpha : \sigma^a \bullet \beta : t^s} \quad \text{(R9)}
\]
This rule instructs us to replace mixed type terms involving the conjunction
‘$\blacklozenge$’ with terms conjoined by a ‘•’ when the CIE object is propositional (of type
t). Roughly, we have a change in bookkeeping device corresponding to a
change in typing: the diamond indicates that the two terms it conjoins are
still ‘active’ in the derivation, but the bullet indicates that the CIE side has
already gotten all its arguments and is ready for interpretation. R9 thus,
in a sense, moves shunting-typed terms out of active use. Doing so allows
for interpretation via the rule in (20). Again, we will see examples in the
following sections.
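The interplay of (R8) and (R9) can be sketched in code. The encoding is an assumption made for illustration only: a mixed term is a pair of Python functions (or finished strings) with their types, and the sample entry for irassharu pairs the at-issue predicate with a shunting-typed honorific condition, so its right-hand type is not the CI type used in (18).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mixed:
    """A diamond-conjoined term: at-issue part, shunting part, and their types."""
    at_issue: object
    shunted: object
    at_issue_ty: object  # e.g. ("e^a", "t^a"), or "t^a" once saturated
    shunted_ty: object   # e.g. ("e^a", "t^s"), or "t^s" once saturated


def r8(m: Mixed, gamma, gamma_ty) -> Mixed:
    """(R8): apply both sides of the diamond to the same at-issue argument."""
    if isinstance(m.at_issue_ty, str) or isinstance(m.shunted_ty, str):
        raise TypeError("(R8) needs functional types on both sides")
    a_arg, a_res = m.at_issue_ty
    s_arg, s_res = m.shunted_ty
    if not (a_arg == s_arg == gamma_ty):
        raise TypeError("(R8) needs one argument type shared by both sides")
    return Mixed(m.at_issue(gamma), m.shunted(gamma), a_res, s_res)


def r9(m: Mixed):
    """(R9): once the shunting side is of type t^s, switch to a bullet pair."""
    if m.shunted_ty != "t^s":
        raise TypeError("(R9) applies only when the shunting side is t^s")
    return (m.at_issue, m.shunted)


# Hypothetical mixed-type entry for irassharu.
irassharu = Mixed(
    lambda x: f"come({x})",
    lambda x: f"honor(s, {x})",
    ("e^a", "t^a"),
    ("e^a", "t^s"),
)
saturated = r8(irassharu, "sensei", "e^a")
bullet_pair = r9(saturated)
```

Here `bullet_pair` holds one at-issue proposition and one CIE proposition, and no copy of the argument is left over.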
At this point, it is possible to abstract away from the honorific example
provided earlier to make clear the general need to use shunting types on the
CI side of the mixed type. Recall that the CI types in LCI are not resource
sensitive; they always return their at-issue input as well as the result of
applying the CI type to this input. (R4) yields an object of the type $\sigma^a \bullet \tau^c$
when a functional CI type $\langle \sigma^a, \tau^c \rangle$ is applied to something of type $\sigma^a$. But
this means that, if we use CI types, then in the terms typed as
$\alpha(\gamma) \blacklozenge \beta(\gamma) : \tau^a \times \upsilon^c$ yielded by a variant of (R8) which uses CI types, the object to the
right of the diamond will itself be of the form $\gamma : \sigma^a \bullet \beta(\gamma) : \upsilon^c$, due to (R4),
as we have seen. The result of the application is therefore of the
form $\alpha(\gamma) \blacklozenge \gamma : \sigma^a \bullet \beta(\gamma) : \upsilon^c$. We have seen an instance of this with the
attempted (and failed) derivation of (17) above. This means that there is an
‘unused’ term of type σ a floating around in the derivation, which will result
in ill-formedness. We do not want this, and we can avoid it by using shunting
types on the right-hand side instead. Such types remove the terms they apply
to from the at-issue dimension completely, which clearly is what is needed in
this case.30
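The problem of the leftover term can be seen in a small sketch that compares the two options. Terms are plain strings and the function names are hypothetical; the first function imitates a variant of (R8) built on CI types, so that (R4) hands the argument back, and the second imitates the shunting version.

```python
def apply_with_ci_type(alpha: str, beta: str, gamma: str):
    """Right-hand side of CI type: (R4) returns the argument with the result."""
    left = f"{alpha}({gamma})"
    right = [gamma, f"{beta}({gamma})"]  # γ • β(γ)
    return left, right


def apply_with_shunting_type(alpha: str, beta: str, gamma: str):
    """Right-hand side of shunting type: the argument is consumed."""
    left = f"{alpha}({gamma})"
    right = [f"{beta}({gamma})"]         # β(γ) only
    return left, right


def unused_arguments(result, gamma: str) -> int:
    """Count bare copies of the argument left over on the right-hand side."""
    _, right = result
    return right.count(gamma)


with_ci = apply_with_ci_type("come", "honor_s", "sensei")
with_shunting = apply_with_shunting_type("come", "honor_s", "sensei")
```

On the CI variant one bare copy of `sensei` remains with nothing to combine with, while the shunting variant leaves none.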
With this rule and the type system in Appendix B.2, we are able to provide
an adequate semantics for lexical items that introduce simultaneously at-
issue and conventionally implicated content, by defining objects of mixed
at-issue and CI types.31 The next section shows in detail how this can be done
for pejoratives, and the following section, 2.6, how it applies to other parts
of natural language in which we find mixed content.
30 If one takes the intuitive interpretation of shunting types to be ‘main conventionally impli-
cated content,’ then the definition of mixed types indicates that there are two kinds of ‘main
content’ in mixed-type sentences. I myself do not find this very counterintuitive.
31 A reviewer asks whether we need CI types at all anymore, given the new system. The
suggestion is that one could make all types for CIE objects use the format of mixed types,
but just provide a tautological component on the at-issue side, for instance the identity
λX.X for polymorphic types. I do not see any technical reason this could not be done,
though there might be reasons one would want to make a clear distinction between mixed
and unmixed types in the type system. In any case, the comment shows that L+CI is in fact a
genuine extension of LCI . Thanks to the reviewer for picking up on this point.
The reviewer finds these grammatical and suggests that they are problematic, because
only the CIE content distinguishes the two categories in each case. This is an interesting
observation, but speakers I have consulted (including myself) find the examples infelicitous.
I myself feel they are contradictory, especially (ii). I therefore will not modify the theory
to address them. But one suggestion might be that, for those that find such examples
OK, there is some content present in the pejoratives in addition to the CIE content which
distinguishes the two properties; perhaps it is even the case that some of the CIE content
has been reanalyzed as at-issue. I will not speculate further.
associated with conventional implicatures, but, since they also denote at-
issue content, they can serve as main predicates and are affected (in part)
by various semantic operators. It does not seem at all difficult to find such
expressions; in fact, many examples are noted in the literature. Let us begin
by returning to the Japanese mixed content honorifics discussed in section
2.4. There I discussed the honorific irassharu, which has the at-issue content
of an ordinary motion verb and the CIE content that the speaker honors the
individual denoted by the sentential subject. In L+CI, this can easily be given
an analysis.33
(23) irassharu = $\lambda x.\, \mathrm{come}(x) \blacklozenge \lambda x.\, \mathrm{honor}(s, x) : \langle e, t \rangle^a \times \langle e, t \rangle^s$
Given this lexical entry, we can see that the honorific will participate in
composition in much the same way that (predicative instances of) pejoratives
do. The difference will, of course, be that predication takes place in both
at-issue and CIE dimensions. An example is the following.
No inconsistency is felt here, despite the epithet in the second sentence; and the second
person who came is not honored, consistent with the conclusions of the squib.
We can now consider the details of what one would have to do to analyze
these examples with only the type resources of at-issue and CI types. This
makes the need for shunting types even more obvious than before. I can
see two ways to allow for this in principle in LCI , only one of which involves
modifying the logic at all. The first, as with the propositional part of pejora-
tive meanings, involves letting mixed content elements introduce separate
pieces of content. Then we could simply stipulate that CI application takes
place before at-issue application, yielding a two-step composition process
for mixed type objects. This ordering must be introduced to exploit the
non-resource-sensitivity of CI types. We would get roughly the following,
supposing that both at-issue and CI content are of type $\langle e, t \rangle$.
which in turn yields the meaning $\langle Qa, \{Pa\} \rangle$ by the interpretation rule in
(16). Effectively, this idea amounts to analyzing mixed content terms as two
completely separate lexical objects, one at-issue and one CI, as can be seen
from the fact that in the semantic derivation this application would have
to take place on two distinct nodes. Notice also that the two parts of the
content must be separated in the combinatorics for things to work out. I take
it that this option is entirely undesirable, just as in the case of pejoratives.
However, there may be arguments for this style of analysis in certain cases; I
will discuss some below, and also evaluate the whole style of this approach
as a possibility for the general analysis of mixed content bearers.
A second option would be to add a new composition rule to LCI and add a
means of producing mixed types, but not to introduce shunting types, instead
making use of only the standard Pottsian CI types, $\sigma^c$.34 Together with this,
we would require a composition rule for ‘mixed bullet types,’ necessary in
order to avoid the unwanted duplication of content that would result from
allowing the application of R4, as discussed in section 2.2. This rule would
have to look roughly like the following. This can be viewed as an attempt
to solve the problems introduced by the rule (19), which of course caused
difficulties stemming from lack of resource sensitivity.
34 The rule for producing such types is the obvious analogue of B.2.1.i in which ‘•’ is substituted
for ‘_’ and all instances of shunting types are replaced with CI types.
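\[
\frac{\alpha \bullet \beta : \langle \sigma^a, \tau^a \rangle \times \langle \sigma^a, \upsilon^c \rangle \qquad \gamma : \sigma^a}{\alpha(\gamma) : \tau^a \bullet \beta(\gamma) : \upsilon^c} \quad (25)
\]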
The crucial point here is that the benefactive introduces both a causative
at-issue meaning and a conventional implicature to the effect that the caused
event benefited the causer. Again, this expression satisfies both criteria for
mixed content bearing: it is both monomorphemic and introduces content
along two dimensions. This is plainly an instance of mixed content.
35 I follow Kubota and Uegaki’s glosses and morphological analysis.
This lexical entry is of mixed type; derivations with it will proceed via the
rules (R8), for the combinatoric steps, and (R9), for the final step which shifts
the mixed content to something interpretable via (20). Here is the derivation,
with types and rules of proof only.37
(28) $h_c \blacklozenge \mathrm{honor}(s_c, h_c) : e^a \times t^s$
I make use of just an honorific relation here, following Potts & Kawahara
(2004). I do not want to take a position on its content here because mere
use of a pronoun need not indicate that the addressee is actually honored. It
is difficult to decide exactly what should be made of insincere uses of such
pronouns. Potts & Kawahara (2004) analyze Japanese subject honorifics as
36 The term morat-ta ‘Ben-Pst’ is derived from mora-u ‘Ben-Npst’ via morphological operations
that are of no concern to us here.
37 π1 and π2 here are the usual projection functions/pullbacks on product types, which work
to pick out the first or the second element of the product type, respectively.
This generalization can be taken to mean that colored terms have denotations
of a similar type to the subject honorifics discussed earlier. We can give them
lexical entries as follows.
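(30) a. steed = $\lambda x.\, \mathrm{horse}(x) \blacklozenge \lambda x.\, \mathrm{noble}(x) : \langle e, t \rangle^a \times \langle e, t \rangle^s$
b. nag = $\lambda x.\, \mathrm{horse}(x) \blacklozenge \lambda x.\, \mathrm{useless}(x) : \langle e, t \rangle^a \times \langle e, t \rangle^s$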
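What these entries predict can be illustrated with a toy model. The sketch below is only an illustration: the sets `HORSES`, `NOBLE` and `USELESS` and the individuals in them are invented, and each entry simply returns its at-issue value and its CIE value as a pair.

```python
# Hypothetical toy model: which individuals are horses, noble, or useless.
HORSES = {"bucephalus", "dobbin"}
NOBLE = {"bucephalus"}
USELESS = {"dobbin"}
DOMAIN = HORSES | {"fido"}


def steed(x: str):
    """Return (at-issue value, CIE value) for 'steed' applied to x."""
    return (x in HORSES, x in NOBLE)


def nag(x: str):
    """Return (at-issue value, CIE value) for 'nag' applied to x."""
    return (x in HORSES, x in USELESS)


# The two words agree on every individual in the at-issue dimension...
same_at_issue = all(steed(x)[0] == nag(x)[0] for x in DOMAIN)
# ...but come apart in the CIE dimension for at least one individual.
differ_in_cie = any(steed(x)[1] != nag(x)[1] for x in DOMAIN)
```

In this model the two colored terms are truth-conditionally indistinguishable and differ only in the dimension that the shunting type contributes.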
The meaning of this modifier has two parts. First, it performs intensification
in the at-issue dimension, so (31a) means that the referent of that is extremely
(or ‘totally’) interesting; but the speaker also indicates that she holds some
emotive attitude toward the sentential content. This latter part is expressive
or conventionally implicated, and indeed bears the usual hallmarks of emotive
expressive meanings: for example, it is highly context dependent with respect
to positivity and negativity.41 McCready and Schwager further provide a
formal semantics for the intensifier in L+CI. The analysis is complex, and I
will not review it here; but it is at least clear that ur passes the tests I have
proposed for mixed content bearers.
41 Footnote 50 discusses the issue of context dependence of emotive meanings further.
I suppose that there are many other kinds of mixed content, but most
have not come to the attention of researchers yet. The previous discussion
should at least show the usefulness of the notion. There is plainly much more
work to be done on the range of conventionally implicating and expressive
items in the world’s languages, but I hope that the small sample given here
and in the previous section shows that the type-theoretic tools proposed here
have useful application in their analysis.
3 Main CIEs
The logic proposed in the previous section, L+CI, does more than allow for the
analysis of mixed content. The introduction of shunting types that was shown
to be necessary for that purpose also makes available another possibility for
semantic denotation. As we have seen, the result of composition with mixed
terms is similar in the end to the addition of supplementary information via
conventional implicatures: this similarity is modeled by letting both sorts of
CIE content be conjoined to at-issue content via the bullet. Shunting types,
though, because of their resource sensitivity, allow for a situation where
there is no at-issue content at all. The aim of this section is to show that this
feature of the logic should not be taken as a negative one.
The existence of shunting types implies that it is possible that a particular
sentence (or utterance) can convey only CIE content. We will examine several
cases where this situation appears to be realized. In general, this situation
is somewhat special; the uses of language most often analyzed in linguistic
and philosophical work serve to convey information about the world, rather
than to express aspects of the speaker’s mental state or meta-information
about the conversation, which (arguably) is the function of conventional
implicature. Information about the world is thus conveyed mostly by default
here, or in ways other than via the conventional implicature itself, e.g. when
the ‘primary’ content is present in the context, or entered into it by other
means. This observation suggests a division in content type which we will
find to be borne out, at least at the level of inspection that I can provide in
the present context.
The discussion is structured as follows. In section 3.1, I briefly show
why shunting types imply that CIE content can be primary. Section 3.2
examines a first case, the basic cases of single-word utterances of particles
of the kind introduced in Kaplan 1999. There it is also shown that these
cases exhibit unexpected behavior from the perspective of LCI in that they
can fall in the scope of certain semantic operators. As it turns out, the
existence of shunting types makes it possible to allow for these cases while
simultaneously retaining Potts’s generalizations about the interaction of
semantic operators and CIE content. Section 3.3 discusses the Japanese
adverbial yokumo, which exhibits a different kind of behavior: while the
denial test supports an analysis of the content of sentences containing this
adverbial as CIE, there is composition within the adverbial scope, unlike what
is found with Kaplan’s particles (as noted by Kratzer 1999). It is shown that
analyzing yokumo as being of shunting type provides an explanation
of its behavior with respect to denials. Section 3.4 concludes with some suggestions
about possible related phenomena.
The reason that shunting types allow for utterances with only CIE content
is the resource-sensitivity of these types. The function of shunting types
is to ‘shunt’ at-issue content into the CIE dimension of meaning; because
of the resource-sensitivity of these types, no at-issue content remains. Any
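successful derivation will result in an object of type $t^s$. Here is a sample, with
two applications:
\[
\frac{\beta : \tau^a \qquad \dfrac{\alpha : \sigma^a \qquad \gamma : \langle \sigma, \langle \tau, \upsilon \rangle \rangle^s}{\gamma(\alpha) : \langle \tau, \upsilon \rangle^s}\;\mathrm{R7}}{\gamma(\alpha)(\beta) : \upsilon^s}\;\mathrm{R7}
\]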
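The sample derivation can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the prose says that a shunting-typed functor consumes its at-issue argument and leaves nothing in the at-issue dimension, and the hypothetical function `shunt_apply` encodes just that reading of R7. The class and function names are illustrative and are not part of the logic itself.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    name: str  # e.g. "sigma", "tau", "upsilon"
    dim: str   # "a" (at-issue), "c" (CI) or "s" (shunting)


@dataclass(frozen=True)
class Func:
    arg: "Type"
    res: "Type"


Type = Union[Basic, Func]


def dim(t: Type) -> str:
    """Dimension of a type: that of its final result."""
    return t.dim if isinstance(t, Basic) else dim(t.res)


def shunt_apply(fn: Type, arg: Type) -> Type:
    """Assumed reading of R7: the at-issue argument is consumed and
    only the shunting-typed result is returned."""
    if not isinstance(fn, Func) or dim(fn) != "s":
        raise TypeError("functor is not of shunting type")
    if dim(arg) != "a" or fn.arg != arg:
        raise TypeError("argument must be the at-issue type the functor expects")
    return fn.res


sigma, tau = Basic("sigma", "a"), Basic("tau", "a")
upsilon = Basic("upsilon", "s")
gamma = Func(sigma, Func(tau, upsilon))  # <sigma, <tau, upsilon>>^s

step1 = shunt_apply(gamma, sigma)  # type of gamma(alpha)
step2 = shunt_apply(step1, tau)    # type of gamma(alpha)(beta)
```

After both applications the only remaining object is of shunting type, which is the resource-sensitivity the paragraph above appeals to.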
Potts (2005) as well, for precisely the same reasons; I therefore modify (20) to
cover the case where the utterance contains only content of type $t^c$ as well. I
will simply stipulate that in cases where a sentence lacks asserted content
it is still interpreted as a 2-tuple, but one with a first (left) element which is
always satisfiable. I will denote this trivial assertion by T. The result of all
this is a definition with two distinct cases, one which applies when there is
an asserted proposition, and one which applies when there is not.
The second issue is less easily resolved. We have a fairly good idea of
what conditions there are on assertion and what norms govern this speech
act. But these norms do not necessarily apply when there is no asserted
content present in an utterance. What then are the norms of the use of
sentences which have CIE content as their primary content?43 This is a
difficult question and one which might be asked about all uses of CIE content.
It is not really clear at this point exactly what the normative conditions are
on the use of supplementary CIEs, for example. A full answer is therefore
far beyond the scope of this paper. I can only suggest a path toward an
answer here. It seems that what the ‘norms of expression’ are depends on
what kind of act is at issue. In assertion we are, roughly, concerned with the
transmission of true information. If a sentence is false, then a norm has been
violated. With respect to CIE content, one can think of a notion of ‘expressive
correctness,’ following Kaplan; the question then becomes what exactly it
takes for something to be expressively correct. The answer to this turns on
what one takes the function of CIEs to be. It is not clear to me that we have
the necessary understanding of their function yet. Once we do, we will be in
a better position to articulate the norms of expressive use.
Let us now turn to some empirical facts, focusing on particles and adver-
bials.
43 Thanks to Kai von Fintel (p.c.) for raising this question.
3.2 Particles
(33) Man!
This kind of case is discussed briefly by McCready (2008b). There man was
taken to be a conventional implicature-introducing propositional modifier
that applies to a proposition made available by context. If one agrees with this
analysis (and if one follows the analysis of proposition-modifying sentence-
initial man offered in that paper) one ends up with an undesirable situation
where both man(φ) and φ are directly communicated. The reason is that
man would end up being analyzed as of type $\langle t, t \rangle^c$, which means that one
ends up with the denotation $\phi : t^a \bullet \mathit{man}(\phi) : t^c$ for the sentence. Intuitively,
though, this is not correct: ϕ is not asserted by sentences like the above. To
see this, consider cases where a question is answered with the particle:
about the weather. It is not easy to see which of these options is correct, for
it’s not clear that there are empirical tests to distinguish between the two
positions.44 However, as we’ll see, either approach proves to give support to
an analysis of particles that takes them to denote objects of shunting type.
Clearly, on either analysis, stand-alone particles provide another case
where the conventionally implicated content is the primary content of the
utterance. If we assume that a proposition is being directly modified, man
can be typed as
$\lambda p.\, \mathit{man}(p) : \langle t^a, t^s \rangle$
ignoring the actual content of the particle, which is roughly that the speaker
has some kind of emotional reaction toward p (that it is good or bad).45
This analysis disallows the assertion of p itself, as desired. The question
of how extensively we should take particle meanings to be analyzable in
terms of shunting types is left for another occasion; it turns on the empirical
question of whether or not the propositional content of sentences modified
by particles can serve as answers to questions. In many cases it is clear that
they can, in others, perhaps not.
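The contrast between the two typings of man can be made concrete with a small sketch. It assumes that CI application passes its argument through unchanged in the at-issue dimension (which is what yields the unwanted denotation discussed above), while shunting application leaves only the trivial assertion T in the first slot of the 2-tuple. The function names and the string encoding of content are hypothetical.

```python
TRIVIAL = "T"  # the always-satisfiable first element of the 2-tuple


def ci_application(modifier: str, p: str) -> tuple:
    """Modifier of type <t,t>^c (assumed behavior): p survives as
    at-issue content alongside the CIE content."""
    return (p, f"{modifier}({p})")


def shunting_application(modifier: str, p: str) -> tuple:
    """Modifier of type <t^a, t^s>: p is consumed, so only CIE
    content remains and the at-issue slot is trivial."""
    return (TRIVIAL, f"{modifier}({p})")


ci_result = ci_application("man", "p")
shunt_result = shunting_application("man", "p")
```

Only the second result avoids communicating p directly, which is the behavior wanted for stand-alone particles.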
Another kind of even more obvious case is that of expressives that do not
perform any modification, such as salutations or fully expressive exclama-
tions (cf. Kaplan 1999; Kratzer 1999). On the second analysis of stand-alone
particles like man, they too will fall into this category.
(35) a. Thanks!
b. Good morning.
c. Ouch!
Expressions like these lack truth conditions, though they can be expressively
correct (appropriate) or not. They plainly do not assert anything.46 They can
be analyzed as objects of type $t^c$ (or $t^s$), which simply express something
about the speaker’s mental states or what she takes the situation to be like.
44 We cannot, for instance, make use of the kind of binding tests that proponents of ‘unarticu-
lated constituents’ have taken as evidence for their approach (cf. Stanley 2000 for a use of
these tests, and Cappelen & Lepore 2005 for critical discussion).
45 The semantics of man is discussed in detail in McCready 2008b.
46 As the editors point out, this is so only if one does not accept relevant aspects of the
performative hypothesis, according to which (35c), for example, would assert something like
‘I hereby express “ouch!”’. Discussion of the hypothesis with arguments for and against it can
be found in Levinson 1983.
(37) a. Situation 1: You stub your toe on the curb while walking down
the street with your friend Curly.
b. Situation 2: Your friend Curly suddenly pokes you in the eye with
a fork.
(38) a. Ouch!
b. Ouch, man!
McCready & Schwager (2009) analyze uses like these as expressing that the
speaker has maximal epistemic commitment to her justification for her use
of the modified proposition, so (39a) would express that the speaker is
maximally committed to her justification (evidence) that John came to the
party. It turns out that these modifiers can also modify purely expressive
items in some dialects of English.
On the McCready and Schwager analysis, this would express that the speaker
has maximal commitment to her justification for uttering ouch, itself an
expressive item. Presumably such justification would be a pain felt by the
speaker or something similar. But the main point for our purposes here is
that ouch is a bearer of purely expressive content. A proper analysis of cases
like these therefore will, again, require modification of expressive content.
We have now seen that there are instances in which purely expressive
content is modified. This means that we must add to the system a provision
for operators that take CIE content as input. But what type of content
should this be? The worry is that, if we allow operators over CI types ($\sigma^c$),
the generalizations made by Potts (i.a.) about modification of conventional
implicatures such as the content of appositives are lost. The natural way to
avoid this problem is to analyze man and totally in (39) as operators over
shunting-typed objects, that is, to make them of type $\langle t^s, t^s \rangle$.48 Such types are
47 I believe this follows from the analysis of sentence-final man given in McCready 2008b, on
which it performs a dynamic strengthening of speech acts, though I will not provide details
here.
48 Of course, there is also a need for a typing for these operators that allows them to modify
at-issue content as well: $\langle t^a, t^s \rangle$. Depending on the facts about modification of CIE content,
easily added to the system (via clause (i) of B.1.1). With this move the Potts
generalizations are maintained in the type system.
I believe that the particles, and particularly the expressives like (35), are
the clearest instances of sentences which lack at-issue content, and, perhaps
as a consequence, are the instances which have received the most attention
in the literature. Let us now turn to another kind of sentence that does not
appear to have at-issue content.
3.3 Yokumo
The second example we will consider is that of sentences modified by the Japanese
adverbial yokumo. In line with McCready 2004, I will argue that yokumo
introduces three pieces of content: a) a statement of the speaker’s emotional
attitude toward the modified proposition ϕ, b) a statement regarding the
prior probability the speaker assigned to ϕ, and c) a condition on mutual
knowledge of ϕ. Unlike McCready 2004, however, I will analyze conditions
(a) and (b) as conventionally implicated rather than asserted, for reasons
which will become clear. The question of the status of (c) is more difficult to
resolve, but in the end I will conclude that it is presuppositional.
The meaning of yokumo is complex, as may already be clear from the brief
discussion above. Here are some representative examples, with somewhat
rough translations.49
This example can be taken to indicate that yokumo cannot provide new
information. In my earlier work I modeled this knowledge requirement via
a condition on update: update is only defined if both hearer and speaker
already know the content of the proposition, in conjunction with an assump-
tion of common knowledge. There are several options regarding how this
condition should be stated. On the one hand, it is possible to simply pre-
suppose that $CG_{\{s,h\}}(\phi)$, that ϕ is common ground for speaker and hearer;52
on the other hand, taking a less interactive approach to the dynamics of
information, we can simply stipulate that an update with yokumo(p) is only
defined if update with p does not alter the information state of speaker or
hearer. These two conditions amount to the same thing for present pur-
poses.53 I will make use of the former method in this paper.54 We arrive at
the following lexical entry.55
52 See van Ditmarsch, van der Hoek & Kooi (2007) for the semantics of this operator.
53 We do not need to concern ourselves with deep questions about the difference between
knowledge and belief here, for instance.
54 In McCready 2004, I took the second route. This decision was partly motivated by the fact
that the particle na can induce felicity, which I took to mean that it can help introduce
content into the common ground. Since I will not consider the action of this particle in this
paper, we can avoid detailed discussion of common ground and update. In any case, it may
well turn out that na has a different function that makes sentences modified by it compatible
with yokumo (McCready, in preparation).
55 One might think that all this is unnecessary, given that surprise(φ) is factive, if we assume
that the logical predicate has the same interpretation as the natural language surprise, which
I see no reason to do. But even if it is presupposed that φ, must we take φ to be common
knowledge? The answer is yes. First, note that what is presupposed by surprise(φ) is not
φ but that the speaker (believes herself to have) learned φ at some past time, which is
already the wrong interpretation. Further, this presupposition should be accommodatable;
but it is not. This is surprising given the results of Kaufmann (2009), who shows that
such presuppositions should be readily accommodatable, unlike presuppositions about
the common ground. I take this to indicate that the presupposition of common ground is
needed.
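The definedness condition on yokumo can be sketched with information states modeled as sets of worlds. The sketch below follows the second, less interactive formulation mentioned in the text (update with p must leave both the speaker's and the hearer's state unchanged), which the text treats as equivalent for present purposes to the common-ground presupposition adopted in the paper. Modeling update as set intersection, and all names used here, are assumptions made for illustration.

```python
from typing import FrozenSet

World = str
State = FrozenSet[World]


def update(state: State, prop: State) -> State:
    """Eliminative update: keep only the worlds where prop holds."""
    return state & prop


def yokumo_defined(speaker: State, hearer: State, prop: State) -> bool:
    """yokumo(p) is defined only if updating with p changes neither
    the speaker's nor the hearer's information state."""
    return update(speaker, prop) == speaker and update(hearer, prop) == hearer


p = frozenset({"w1", "w2"})                   # worlds where p is true
speaker = frozenset({"w1"})                   # speaker already knows p
informed_hearer = frozenset({"w1", "w2"})     # hearer already knows p
uninformed_hearer = frozenset({"w1", "w3"})   # hearer leaves not-p open

ok = yokumo_defined(speaker, informed_hearer, p)
blocked = yokumo_defined(speaker, uninformed_hearer, p)
```

On this encoding yokumo can never contribute p as new information, since the update is undefined whenever p would shrink either state.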
Each of the possible denials in (46) is infelicitous. One might try to explain
this in terms of ‘privileged content’ or speaker relativity; it is known that it is
difficult to make claims about the truth or falsity of claims that depend (in
part) on the speaker’s preferences (cf. Lasersohn 2005; Stephenson 2007).
It makes some sense, given this, that the emotive content of the adverbial
is hard to deny. But this argument does not go through for the
probability statement.57
The analysis starts with the observation that it is not actually impossi-
ble to deny the content of the adverbial — it just cannot be done with the
responses in (46). Less direct expressions are needed.
a. Chigau yo!
wrong PT
56 Here we suppose that it is known that the referent of ‘he’ is marrying Dallas.
57 If probabilities are understood as subjective, the basis for assertion may indeed be hard to
deny. But it seems clear that statements about likelihood become part of the public domain
once made, so denial of the surprise clause in the denotation of yokumo is surely possible.
‘That’s wrong!’
b. Sonna koto nai yo!
that-kind-of thing Cop.Neg PT
‘That’s not right.’
These facts are reminiscent of facts noted by Potts (2005) about conventional
implicatures. How can one call the content of a nominal appositive into
question, given that it cannot be denied directly?
What I will call truth-directed denials like those in (46) cannot target conven-
tionally implicated content, but only asserted content. Denials like (47) can
target either type of content. If we assume that the content of yokumo is con-
ventionally implicated, the facts in (46) are therefore immediately explained.
Note that the fact that truth-directed denial can target the asserted content
in (48) and not in (46) has an immediate explanation: (48) asserts that Bill
is rich, but (46) asserts nothing at all, for it is already common ground that
Dallas and Austin got married.58
58 Another commonality can be found with denials. Note that there are two parts to the
‘deniable’ content of yokumo sentences, given that the proposition modified is already part
of the common ground: the emotive content and the statement of surprise. For many (but
not all) speakers, the denials of yokumo-modified sentences in (47) can only target one of
these, meaning that they can deny the good/badness of the marriage, or its surprisingness,
but not both. The same seems to hold for sentences in English where multiple conventional
implicatures are tied to the same host NP, as in (ia). Here, the denial in (ib) seems to indicate
that either a) John is not a banker, or b) that he does not own a large house. It is difficult
to understand (ib) as denying both together. If this data is correct, the identification of the
content introduced by yokumo as conventional implicature receives additional support.
However, none of this follows from the analysis I am going to provide in terms of $\mathcal{L}^{+}_{CI}$,
where the adverbial simply introduces a conjunction; unless it is assumed that only a
single conjunct can be targeted by a denial in the case of conventionally implicated content.
Formally, we might take the adverbial to introduce several distinct conditions, for example
3.4 Conclusion
In this section we have seen several areas in which natural language appears
to make use of the possibilities afforded by shunting types, and have also
had occasion to slightly extend $\mathcal{L}^{+}_{CI}$ to allow for modification of
shunting-typed objects. I hope the reader has been convinced of their usefulness. I
do not think that this discussion exhausts the utility of shunting types: for
example, one other area where I think they could be useful is in the analysis
of exclamatives, which have the combinatory properties one would expect
from shunting-typed objects, given certain
61 This argument seems reasonable, but the presupposition that the modified proposition
is in the common ground is less simple to get clear about. How can we be sure that
presuppositions of this sort, that have no real equivalent in non-technical natural language,
are not actually conventionally implicated? I do not know of a really good way. The issue
is general, and has received a bit of recent discussion by Schlenker (2008), who raises
worries for his theory of presupposition involving complex presuppositions that cannot be
articulated easily or at all in natural language. This is an interesting issue but a difficult one,
and I will not be able to do it full justice in this paper.
62 Another way to interpret these results is to conclude that yokumo introduces a different
kind of content, that behaves in some ways similarly to CIE content (cf. the comments of a
reviewer). This seems possible; but it also seems that, even in this case, it behaves like CIE
content where it can appear. I think this justifies using the present system to analyze it.
assumptions.63 They also exhibit semantic similarities with yokumo and even
the modifications done by particles, which suggest a larger correspondence.
The topic is large enough that I cannot do justice to it here. Another area is
expressive small clauses, sentential phrases like (49), discussed by Potts &
Roeper (2006).
Utterances like this one do not exhibit any at-issue content; there is nothing
for truth-directed denials to target, for example. This fact makes it look like
shunting types should be involved. As Potts and Roeper state, though, it is
not completely clear how the details of the composition should work, and I
cannot improve on their observations here.
In a sense, the conventional implicatures introduced by shunting-typed
content remain supplementary, at least in the cases examined here; the dif-
ference with ‘ordinary’ conventional implicatures of CI type is that shunting-
typed objects supplement content that is already present, and not asserted
by the sentence providing the supplementary information. In the case of
yokumo, this content must be introduced via accommodation, if it is not
already present; but this presents no special difficulties, unlike presupposi-
tions of some kinds of expressive content (e.g. Kaufmann 2009). For some
other instances of CIE content in contexts where no assertion is made, the
situation can be different, for instance in the analysis of the Japanese modal
particle daroo provided by Hara (2008). According to this analysis, daroo(ϕ)
conventionally implicates that µ(ϕ) > 50%, but does not assert anything.
Hara notes that $\mathcal{L}_{CI}$ is not appropriate for analyzing this case, in that, given
that this type system returns ϕ itself in the at-issue dimension, Gricean
maxims would be violated by any use of daroo to modify a proposition.
$\mathcal{L}^{+S}_{CI}$, however, makes the right predictions (assuming that the Hara analysis is
correct). What these cases have in common is that the conventionally implicated
content is, in some sense, primary to the intent behind the utterance.
63 For instance, one must say something about ‘embedded exclamatives.’ One possible route
is to note that embedded instances of exclamatives show very different behavior from
non-embedded instances, a fact already noted by Rett (2008), who draws a sharp distinction
between the two types.
Let us now examine a single phenomenon (or group of phenomena) that seems
to make use of all the types of content discussed here. This is the system
of Quechua evidentials, for which $\mathcal{L}^{+}_{CI}$ can provide an alternate analysis to
the proposal of Faller (2002), on which these evidentials modify speech acts.
I will begin by giving the basic background and facts that a theory of the
evidentials should explain. I then briefly present Faller’s speech act-based
analysis and show (following McCready 2008a) that, despite the conventional
implicature-like behavior of the evidentials, an adequate analysis cannot
be given in $\mathcal{L}_{CI}$. I then show that such an analysis is available in $\mathcal{L}^{+}_{CI}$. The
intent is to duplicate the basics of Faller’s analysis as closely as possible in a
conventional implicature-based system which does not make use of speech
acts. I should make two caveats before embarking on this project. First, the
proposal I make here does not account for many of the subtle issues that
arise in the Quechua evidential system, only the most basic, brute facts about
the way in which composition seems to work for the different evidentials
in the language.64 Second, the analysis of Faller (2002) is by no means the
last word on this subject. More recent work by Faller (2003, 2006, 2007)
introduces additional complexities, which I will also leave aside. This section
should therefore be taken as only a sketch of an alternate analysis, in which
we see how one can ensure some kinds of scope behavior without making
anything other than lexical stipulations about types of content.
Cuzco Quechua has several enclitic suffixes that mark evidentiality:
roughly, the nature of the speaker’s justification for the claim made by
the utterance. Faller analyzes three suffixes in detail. The first is the direct
evidential -mi, which indicates that the speaker has the best available grounds
for the claim made, which generally amounts to perceptual evidence. The
second, -si, is a hearsay evidential which indicates that the speaker heard
the information expressed in the claim from someone else. Finally, -chá, an
inferential evidential, indicates that the speaker’s background knowledge,
plus inferencing, provides evidence for the proposition the modified sentence
denotes, and asserts that the sentence might be true.
(50) a. Para-sha-n-mi
rain-Prog-3-mi
64 I also restrict attention to assertions; complex issues arise with questioning evidentials in
this language, which I am not sure how best to address.
A final basic fact that a theory of evidentials in this language must explain
is that use of the hearsay evidential with a sentence does not commit the
speaker to the content of the sentence. For instance, the first clause of the
following sentence does not commit the speaker to the proposition that a lot
of money was left for the speaker, as the continuation shows.
Thus, roughly, what is needed is the following result, where the evidential
content is not asserted:
Faller uses Vanderveken’s (1990) speech act theory for her analysis. This
theory, like other theories of speech acts, assigns them preconditions for
successful performance. Faller takes evidentials to introduce additional
content into the set of preconditions. For the cases under consideration, we
need only be concerned with one kind of precondition: sincerity conditions
on successful performance of the speech act. For assertions, Vanderveken
takes it to be necessary that Bel(s, p) holds — that the speaker believes the
content of the assertion.65
Most of the action in Faller’s analysis of -mi and -chá is in the sincerity
conditions for the assertion. On her analysis, -mi adds an additional sincerity
condition to the assertion, that Bpg(s, φ). The formula Bpg(s, φ) means that
the speaker has the best possible grounds for believing φ. It is very difficult
to make this condition precise. Faller notes that what counts as best possible
grounds is dependent on the content in the scope of -mi: for externally visible
events Bpg will ordinarily be sensory evidence, while for reports of people’s
intentions or attitudes even hearsay evidence will often be enough.
Faller analyzes -chá as being simultaneously modal and evidential. The
asserted content is therefore ♦φ when φ is modified by -chá; the correspond-
ing sincerity condition also involves ♦φ instead of φ. A sincerity condition
indicating that the speaker’s reasoning has led him to believe that φ might
be possible is also introduced. The hearsay evidential -si is also complex; as
we saw, its use does not commit the speaker to the propositional content p,
which means that the propositional content of the utterance cannot be
asserted. Faller posits a special speech act, present, for this situation,
on which the speaker simply presents a proposition without making claims
about its truth. In addition, the sincerity condition requiring that the speaker
believe φ is eliminated, and a condition stating that the speaker learned φ
by hearsay is added.
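The division of labor in Faller's analysis, as summarized above, can be laid out as a small data structure. This is a rough sketch rather than Faller's formalization: a speech act is reduced to a force, a content, and a set of sincerity conditions, `Poss(...)` stands in for ♦, and the labels `Rea` and `Hearsay` are hypothetical names for the reasoning and hearsay conditions, which the summary describes only in prose.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class SpeechAct:
    force: str                 # "assert" or "present"
    content: str               # propositional content
    sincerity: FrozenSet[str]  # sincerity conditions


def assertion(phi: str) -> SpeechAct:
    """Plain assertion: the speaker must believe the content."""
    return SpeechAct("assert", phi, frozenset({f"Bel(s, {phi})"}))


def mi(phi: str) -> SpeechAct:
    """Direct evidential: adds the best-possible-grounds condition."""
    act = assertion(phi)
    return replace(act, sincerity=act.sincerity | {f"Bpg(s, {phi})"})


def cha(phi: str) -> SpeechAct:
    """Inferential evidential: content and belief condition are modalized,
    and a reasoning condition (hypothetical label Rea) is added."""
    poss = f"Poss({phi})"
    conditions = frozenset({f"Bel(s, {poss})", f"Rea(s, Bel(s, {poss}))"})
    return SpeechAct("assert", poss, conditions)


def si(phi: str) -> SpeechAct:
    """Hearsay evidential: the force is 'present', the belief condition is
    dropped, and a hearsay condition (hypothetical label) is added."""
    return SpeechAct("present", phi, frozenset({f"Hearsay(s, {phi})"}))


direct, inferential, reported = mi("p"), cha("p"), si("p")
```

Laid out this way, the three evidentials differ in which of the three fields they touch, which is the pattern a conventional implicature-based reanalysis has to reproduce.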
While considering the degree to which the semantics of evidentials can
be viewed as homogeneous, McCready (2008a) attempted to provide a con-
ventional implicature-based analysis of the Quechua system. It seems plain
that the evidentials of this language behave in a way similar to conventional
65 This is only a very rough approximation of the normative conditions on assertion. See e.g.
Searle 1969 and Siebel 2003 for discussion.
These are precisely the desired results. This sketch of an analysis for the
Quechua evidential case thus provides an example of a situation in which the
full power of $\mathcal{L}^{+}_{CI}$ is needed to analyze a single linguistic phenomenon. Of
course, the question of whether this analysis or Faller’s speech act-based one
is to be preferred for this case is separate, and depends on working out the
details of the conventional implicature story in connection with looking at
a wider array of more complex data. Still, at minimum, the discussion here
shows that a speech act analysis is not the only possibility for the phenomena
in question.
5 Conclusion
This paper has made two major contributions. It has distinguished and pro-
vided a logical system for the analysis of three distinct types of conventional
implicature: supplementary CIEs as modeled in Potts 2005, CIEs that provide
main content, analyzed in $\mathcal{L}^{+S}_{CI}$ as being of shunting type, and mixed CIEs,
analyzed in $\mathcal{L}^{+}_{CI}$. This typology is novel and is one that I think helps significantly
in understanding CIE phenomena. I doubt it is exhaustive, however. It seems
possible that the three categories analyzed need further subdivision, even
in terms of their typing (there is obvious need for subdivision in terms of
content). I believe that these systems will be useful for researchers working
to understand the range of conventional implicature in the world’s languages;
I hope the above discussion has provided some support for this belief. In
the process, the paper has analyzed a number of phenomena involving CIE
content, mostly of mixed or shunting type: these analyses are the second
contribution of the paper.
One question that has not been addressed in any detail is the nature of
the distinction between conventional implicature and expressive content,
or even whether there is any empirical distinction. I think that, in terms of their
combinatorics, there might well not be any difference. The two show a similar
lack of interaction with most kinds of semantic operators (embedding under
attitudes being a significant exception), which suggests that they act similarly
in terms of compositional semantics. At the present moment, there has
not been sufficient empirical investigation for this point to be really clear.
ii. Further, let x serve as a variable over {e, t, s} and let σ and τ serve as
variables over well-formed types with their superscripts stripped off.
The type-superscript abbreviator is defined as follows:
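\[
\begin{array}{lcl}
x^a & \rightsquigarrow & x^a \\
x^c & \rightsquigarrow & x^c \\
\langle \sigma^a, \tau^a \rangle & \rightsquigarrow & \langle \sigma, \tau \rangle^a \\
\langle \sigma^a, \tau^c \rangle & \rightsquigarrow & \langle \sigma, \tau \rangle^c
\end{array}
\]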
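iii. All instances of ‘$\mathcal{L}_{CI}$’ in the $\mathcal{L}_{CI}$ type specification are replaced
with ‘$\mathcal{L}^{+S}_{CI}$’.
iv. The following two clauses are added to the definition of the
type-superscript abbreviator $\rightsquigarrow$:
\[
\begin{array}{lcl}
x^s & \rightsquigarrow & x^s \\
\langle \sigma^a, \tau^s \rangle & \rightsquigarrow & \langle \sigma, \tau \rangle^s
\end{array}
\]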
• This type definition, bundled with the $\mathcal{L}_{CI}$ rules (R1-6), the newly
defined rule (R7), and the revised interpretation mechanism in (32),
comprises $\mathcal{L}^{+S}_{CI}$.
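The type-superscript abbreviator, including its shunting clauses, is mechanical enough to state as a short function. The sketch below is an illustration only: it encodes a basic type as a hypothetical `(name, dimension)` pair and a functional type as an `(argument, result)` pair, and it assumes that the abbreviation of a nested type carries the dimension of its final result.

```python
def is_basic(t) -> bool:
    """Basic types are (name, dimension) pairs of strings."""
    return isinstance(t[0], str)


def strip(t) -> str:
    """Render a type with every superscript removed."""
    return t[0] if is_basic(t) else f"<{strip(t[0])},{strip(t[1])}>"


def dimension(t) -> str:
    """Dimension of a type: that of its final result."""
    return t[1] if is_basic(t) else dimension(t[1])


def abbreviate(t) -> str:
    """Move the result's superscript to the outside of the whole type."""
    if not is_basic(t) and dimension(t[0]) != "a":
        raise ValueError("the abbreviator is only defined for at-issue arguments")
    return f"{strip(t)}^{dimension(t)}"


t_a, t_s, e_a = ("t", "a"), ("t", "s"), ("e", "a")
print(abbreviate((t_a, t_s)))          # <t,t>^s
print(abbreviate((e_a, (t_a, t_s))))   # <e,<t,t>>^s
```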
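(i) If σ and τ are at-issue types for $\mathcal{L}^{+}_{CI}$, and ζ and υ are shunting
types for $\mathcal{L}^{+}_{CI}$, then $\sigma \times \zeta$, $\langle \sigma, \tau \rangle \times \zeta$, $\sigma \times \langle \tau, \zeta \rangle$ and $\sigma \times \langle \zeta, \upsilon \rangle$
are mixed types for $\mathcal{L}^{+}_{CI}$.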
69 Comment: It is not necessary to use most of the types produced by clause (i) for the analyses
made in the present paper. However, I will make such types available in the logic: I do not
think it wise to restrict the type system too much in view of our limited current knowledge of
the range of mixed type expressions in natural language. Here I in effect follow the practice
of $\mathcal{L}_{CI}$, where a wide range of CI types is made available, although in practice only a narrow
range of them ends up being used.
• This type definition, together with the $\mathcal{L}_{CI}$ rules (R1-7) and the new
rules (R8,9) and the interpretation rule (32), comprises $\mathcal{L}^{+}_{CI}$.
References
Amaral, Patricia, Craige Roberts & E. Allyn Smith. 2008. Review of ‘The
logic of conventional implicatures’ by Christopher Potts. Linguistics and
Philosophy 30(6). 707–749. doi:10.1007/s10988-008-9025-2.
Asher, Nicholas. 2000. Truth conditional discourse semantics for parentheti-
cals. Journal of Semantics 17(1). 31–50. doi:10.1093/jos/17.1.31.
Asher, Nicholas & Alex Lascarides. 2003. Logics of conversation. Cambridge:
Cambridge University Press.
Asher, Nicholas & James Pustejovsky. 2005. Word meaning and
commonsense metaphysics. Ms., University of Texas Austin and
Brandeis University. http://semanticsarchive.net/Archive/TgxMDNkM/asher-pustejovsky-wordmeaning.pdf.
Bach, Kent. 1999. The myth of conventional implicature. Linguistics and
Philosophy 22(4). 327–366. doi:10.1023/A:1005466020243.
Bach, Kent. 2006. Review of Christopher Potts, ‘The logic of con-
ventional implicatures’. Journal of Linguistics 42(2). 490–495.
doi:10.1017/S0022226706304094.
Barker, Chris & Pauline Jacobson. 2007. Direct compositionality. Oxford:
Oxford University Press.
Cappelen, Herman & Ernest Lepore. 2005. Insensitive semantics. Oxford:
Blackwell.
Carpenter, Bob. 1998. Type-logical semantics. Cambridge, MA: MIT Press.
Chierchia, Gennaro. 1998. Reference to kinds across language. Natural
Language Semantics 6(4). 339–405. doi:10.1023/A:1008324218506.
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi. 2007. Dynamic
epistemic logic. Berlin: Springer.
Dummett, Michael. 1973. Frege: Philosophy of language. London: Duckworth.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco
Quechua. Stanford, CA: Stanford University dissertation.
Faller, Martina. 2003. Propositional- and illocutionary-level evidentiality in
Cuzco Quechua. In Jan Anderssen, Paula Menendez-Benito & Adam Werle
Eric McCready
Department of English
Aoyama Gakuin University
4-4-25 Shibuya
Shibuya-ku, Tokyo 150-8366
mccready@cl.aoyama.ac.jp
Semantics & Pragmatics Volume 3, Article 5: 1–15, 2010
doi: 10.3765/sp.3.5
Embedded Implicatures?
Remarks on the debate between globalist and localist theories
Michela Ippolito
University of Toronto
1 Introduction
The implicature that (1) generates — the localist maintains — is that John
thinks that Fred heard some but not all of Verdi’s operas. Assuming a localist
view according to which implicatures are triggered by means of a silent
exhaustive operator O as in Chierchia et al. 2008, the embedded implicature
in (1) is triggered when O adjoins the embedded clause as shown in (2).1 This
gives rise to the meaning in (3).
(3) John thinks that Fred heard some of Verdi’s operas and John thinks
that Fred didn’t hear all of Verdi’s operas
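The reading in (3) can be checked against a toy model. The sketch below is an illustration under stated assumptions: a world is simply the number of Verdi operas Fred heard (out of a hypothetical total of three), `thinks` is universal quantification over John's belief worlds, and O negates the single stronger alternative with all. For contrast it also computes the familiar weaker inference that John does not think Fred heard all of the operas, which is the kind of inference a globalist account derives.

```python
N_OPERAS = 3  # hypothetical total number of operas in the toy model


def some(w: int) -> bool:
    return w >= 1


def all_(w: int) -> bool:
    return w == N_OPERAS


def O(p, alt):
    """Exhaustify p against one stronger alternative: p and not alt."""
    return lambda w: p(w) and not alt(w)


def thinks(belief_worlds, p) -> bool:
    """John thinks p iff p holds in every world compatible with his beliefs."""
    return all(p(w) for w in belief_worlds)


def local_reading(beliefs) -> bool:
    """O adjoined to the embedded clause: thinks(some and not all)."""
    return thinks(beliefs, O(some, all_))


def global_reading(beliefs) -> bool:
    """thinks(some), and it is not the case that thinks(all)."""
    return thinks(beliefs, some) and not thinks(beliefs, all_)


sure_not_all = {1, 2}  # John excludes the world where Fred heard all
undecided = {1, 3}     # John leaves open that Fred heard all
```

With `undecided`, the weaker inference holds but (3) does not, so the two readings are truth-conditionally distinct and the choice between them can in principle be tested.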
To test whether the predictions made by the local view of implicatures are
correct, Geurts and Pouscoulous looked at different types of embeddings. In
their first experiment, they considered complex sentences where the scalar
item some is embedded in the nuclear scope of the universal quantifier all;
under a modal verb with a universal force; in the complement of think; and
finally in the complement of want. They compared the results they obtained
in these cases with the rate of implicatures drawn in unembedded clauses and
found that, while scalar implicatures were accepted in the majority of simple
(unembedded) cases, the acceptance rate was much lower in the complex
conditions (with differences among conditions; see section 3.1 below). The ex-
periment used an inference task in which participants were shown a sentence
containing a scalar expression (e.g. some) and were asked whether they would
infer that the corresponding sentence with the stronger scalar expression
(e.g. all) was false. In a subsequent experiment, the authors compared the
rate of local implicatures found using the inference task with the rate of
local implicatures found using a verification task in which participants were
shown a sentence containing a scalar expression and were asked to decide
whether that sentence correctly described a picture that they were shown.
The result of the latter experiment when applied to unembedded clauses
showed that the inference paradigm yields higher rates of scalar implicatures
than the verification paradigm, and therefore that the verification task is a
more reliable way to find out the rate at which people actually draw scalar
implicatures. When applied to the question of whether local implicatures are
drawn in embedded clauses, the verification task performed by Geurts and
Pouscoulous “completely failed to yield the local SIs predicted by mainstream
conventionalism” (Geurts & Pouscoulous 2009). In particular, the authors
tested scalar items (here, some) embedded in downward-entailing (DE) contexts
(e.g. Not all the squares are connected with some of the circles); scalar
items embedded in upward-entailing (UE) contexts (e.g. All the squares are
connected with some of the circles); and finally, scalar items embedded in
non-monotonic (NM) contexts (e.g. There are exactly two squares that are
connected with some of the circles).
In the next section, taking Chierchia et al. 2008 to be a paradigmatic
example of a localist theory, I will spell out in more detail how this theory
works and I will consider the consequences of Geurts and Pouscoulous’s
experimental results, particularly with respect to the issue of the frequency
with which embedded implicatures are drawn.
3 Discussion
(4) If you take salad or dessert, you pay $20; but if you take both there is
a surcharge.
(5) Exactly two students wrote a paper or ran an experiment. The others
either did both or made a class presentation.
(6) Mary solved some or all of the problems.
Take (4). Chierchia et al. (2008) argue that, while implicatures are not nor-
mally triggered in the antecedent of conditionals (a DE environment), the
continuation in (4) forces an exclusive interpretation of or in the antecedent
(that is an interpretation of the antecedent strengthened with the scalar
implicature “but not both”) as the only way to guarantee a coherent interpretation
for the discourse. Embedding the exhaustive operator in the antecedent
guarantees that such an interpretation is generated.3
Someone might initially object to Chierchia et al.’s (2008) argument that,
if embedded and non-embedded implicatures are generated by the same
mechanism (in this case the exhaustive operator O), then you would not expect
local implicatures to be confined to this very special set of cases. The fact that
local implicatures seem to be confined to a very narrow set of cases, and that
occurrences of scalar items (such as some or or) in embedded positions do
not normally trigger local implicatures raises the suspicion that the “effect”
3 Similarly for (5) and (6). In (5), the continuation is argued to force the embedded implicature
giving rise to the interpretation according to which ‘exactly two students wrote a paper or
ran an experiment but didn’t do both’. The continuation in (6) is also argued to force the
embedded implicature so that as a result the interpretation of the sentence will be that
either Mary solved some but not all of the problems or she solved all of them.
(8) SMH2:
Let S be a sentence of the form [S . . . O(X) . . . ]. Let S′ be the sentence
of the form [S′ . . . X . . . ], i.e. the one that is derived from S by replacing
O(X) by X, i.e. by eliminating this particular occurrence of O. Then,
everything else being equal, S′ is preferred to S if S′ is logically
stronger than S.
According to SMH1, given a certain logical form, all LFs differing in where the
exhaustivity operator occurs will compete with each other and the strongest
LF will be preferred. According to SMH2, alternative LFs differing in the
placement of the exhaustive operator do not compete with each other but
only with the LF without the operator. Taking Chierchia et al.’s (2008) theory
as the paradigmatic localist theory, the question that arises is whether the
localist theory sketched above supplemented with either SMH1 or SMH2 can
be reconciled with Geurts and Pouscoulous’s experimental results.
4 This might explain why, while focal stress is often needed to bring out the embedded
implicature interpretation, focal stress is not needed to bring out the non-local implicature
interpretation. Chierchia et al. (2008) attribute the fact that focal stress helps the embedded
implicature reading of the sentences they consider to the nature of the mechanism they
appeal to, i.e. covert exhaustification, which is triggered by focus. However covert exhausti-
fication is also supposed to be responsible for the non-local implicature raising the question
why focal stress is a relevant factor in the explanation of one type of implicature but not in
the explanation of the other.
(10) John wishes that Fred would try some of the cookies.
a. John wishes that O(Fred would try some of the cookies)
b. O(John wishes that Fred would try some of the cookies)
The configuration in (10a) triggers the embedded implicature that John wishes
that Fred would not try all of the cookies. In (10b), on the other hand, the
implicature is that John doesn’t wish that Fred would try all of the cookies.
Consider first the prediction made by SMH1. Just like in the previous example,
(10)’s assertion supplemented with the embedded implicature in (10a) gives
rise to a meaning stronger than the meaning obtained by incrementing the
same assertion with the implicature in (10b): if John’s desire-worlds are all
worlds where Fred tries some but not all of the cookies, then it is not the
case that all of John’s desire-worlds are worlds where Fred tries all of the
cookies. However, the reverse does not hold: the assertion together with
(10b) is compatible with a state of affairs where in some of John’s desire-
worlds Fred tries all of the cookies, a possibility ruled out by the implicature
in (10a). Therefore, (10a) is predicted to be the preferred reading of the
sentence in (10) by SMH1. One of the conditions that Geurts and Pouscoulous
tested in one of their experiments was embedding of a scalar item under
want and they found that the embedded implicature reading was not the
preferred interpretation of the sentence. If their results can be extended to
any volitional verb, including wish, they show that the prediction made by
SMH1 is not correct. Similarly for SMH2: in this case, both (10a) and (10b) are
predicted to deliver meanings stronger than the meaning obtained without
O and so the two strengthened interpretations are incorrectly predicted
to be equally available. This is so unless some independent contextual
consideration rules out (10a), but as we observed above the context plays no
role in Geurts and Pouscoulous’s experiment and therefore we do not expect
it to be a factor affecting the subjects’ judgments.
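The entailment asymmetry between (10a) and (10b) can be checked mechanically in a toy model. The following is a minimal sketch, assuming only three kinds of worlds (Fred tries none, some but not all, or all of the cookies) and treating John's desire state as a non-empty set of such worlds. The three-world model and the names `reading_a` and `reading_b` are illustrative assumptions, not part of the formal apparatus of Chierchia et al. (2008).

```python
from itertools import combinations

# Toy worlds, classified by how many cookies Fred tries (an assumption of this sketch).
WORLDS = ("none", "some_not_all", "all")

def some(w):
    """Literal, unstrengthened meaning: Fred tries some of the cookies."""
    return w in ("some_not_all", "all")

def all_(w):
    """Fred tries all of the cookies."""
    return w == "all"

def wishes(desire_worlds, prop):
    """John wishes that p iff p holds in every one of his desire-worlds."""
    return all(prop(w) for w in desire_worlds)

def reading_a(d):
    """(10a): John wishes that O(Fred tries some), i.e. some but not all."""
    return wishes(d, lambda w: some(w) and not all_(w))

def reading_b(d):
    """(10b): O(John wishes that Fred tries some), i.e. assertion plus global implicature."""
    return wishes(d, some) and not wishes(d, all_)

def desire_states():
    """Enumerate every non-empty set of desire-worlds."""
    for n in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, n):
            yield frozenset(combo)

# (10a) entails (10b): every state verifying (10a) also verifies (10b).
a_entails_b = all(reading_b(d) for d in desire_states() if reading_a(d))

# (10b) does not entail (10a): collect the states that separate them.
counterexamples = [set(d) for d in desire_states() if reading_b(d) and not reading_a(d)]
```

In this model the only state that verifies (10b) without verifying (10a) is the one in which some of John's desire-worlds are worlds where Fred tries all of the cookies, which is the state of affairs described in the text.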
In conclusion, even when supplemented with a formal mechanism for
predicting when an embedded implicature will be preferred or dispreferred,
Chierchia et al.’s (2008) localist theory fails to account for the fact that
embedded implicatures are systematically dispreferred. Appealing to contex-
tual/plausibility considerations in order to override the outcome of the theory
is problematic since in Geurts and Pouscoulous’s experiments judgments
were elicited out-of-context.
In the next section, I will look at the exceptional behavior of the verb
believe, for which Geurts and Pouscoulous found a higher acceptance rate for
the embedded implicature than in any other complex condition. Even though
the exceptional behavior of believe initially appears to support a localist
theory, I will conclude that it actually constitutes another challenge for it.
According to the classification in Horn 1978, want is also NR. It follows that
Chierchia et al.’s (2008) localist theory predicts that both (12a) and (12b)
should be equally available. However, the rate of acceptance of the embedded
implicature with the modal verb want was low (32%), lower than what they
found in the believe case. The embedded implicature reading is dispreferred,
and nothing in how the exhaustive operator O or SMH work seems to explain
why the embedded implicature is more frequently accepted with believe than
with want.
6 Geurts and Pouscoulous’s sentence was given in (1). (1) is a translation of the French sentence
actually used in the experiment.
Geurts and Pouscoulous, following the lines of van Rooij & Schulz 2004
and Russell 2006, sketch a globalist account for why believe shows a higher
acceptance of the embedded implicature: (i) the sentence Bob believes that
Anna ate some of the cookies generates the global implicature that Bob doesn’t
believe that Anna ate all of the cookies; (ii) assuming that Bob has an opinion
about whether Anna ate all of the cookies or not, it follows that Bob believes
that Anna did not eat all of the cookies. Now, in their paper defending a
localist view of implicatures, Gajewski & Sharvit (2009) criticize this type of
globalist account by arguing that appealing to the disjunctive proposition
“either Bob believes that Anna ate all of the cookies or he believes that she
didn’t” in the reasoning above is only plausible because believe is a NR verb
and as such it carries the presupposition that either α believes that ϕ or α
believes that it is not the case that ϕ (as argued in Gajewski 2005). In other
words, according to Gajewski and Sharvit, the globalist account only appears
to work because the predicate is NR and the disjunctive proposition crucial
to the globalist explanation is actually presupposed by the verb. But if this
were correct, then all NR verbs would trigger an embedded implicature since
they all presuppose the relevant disjunctive proposition. But we just saw that
this is not so: the experimental results reported in Geurts and Pouscoulous
show that local implicatures with want are relatively rare, despite want being
a NR verb. A short digression on NR verbs is in order here. I have assumed
with Horn (1978) that want, like believe but unlike wish, is NR based on the
observation that in (13) but not in (14) the first sentence implies the second.
However, Rooryck (1991) cites the following pair from Horn 1978 against the
view that want/vouloir are NR verbs: while (15) supports the NR hypothesis,
(16) does not.
Rooryck concludes that volitional verbs only appear to be NR but in fact they
are not. An exhaustive discussion of this issue is beyond the scope of this
paper. However, what is important in the context of the current discussion
about embedded implicatures is to notice that even if volitional verbs are not
NR, it is still true that in cases such as (15) vouloir behaves like a NR verb
in that the two sentences are judged to be synonymous, just like originally
observed by Horn. Just like in (15), the English rendition of the implicature
in (12b) (i.e. John doesn’t want Fred to try all of the cookies) is also judged to
have a NR interpretation, and so does the French translation with vouloir.7
Therefore, since the logical form in (12b) receives a NR interpretation, it is
expected to pattern like non-volitional NR verbs such as believe with respect
to the computation of the embedded implicatures, and the experimental
results show it does not.
Going back to the main discussion, obviously the globalist needs to
explain the asymmetry between want and believe too. In principle we should
be able to run the reasoning sketched by Geurts and Pouscoulous for believe
with want: (i) (12) generates the implicature that John doesn’t want Fred to
try all of the cookies; (ii) let us assume that John has a definite desire about
Fred’s trying all of the cookies, that is, that either John wants Fred to try all
of the cookies or he wants Fred not to try all of the cookies; (iii) it follows
that John wants Fred not to try all of the cookies. The crucial step is (ii).
What “blocks” (ii) in the want case but not in the believe case?
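The three-step globalist reasoning for want can be made explicit in a small model. This is a sketch under stated assumptions: worlds are classified only by how many cookies Fred tries, John's desires are a non-empty set of such worlds, and the function names are hypothetical labels for steps (i) to (iii). It is not a formalization taken from Geurts and Pouscoulous or from Russell 2006.

```python
from itertools import combinations

# Toy worlds: how many of the cookies Fred tries (an assumption of this sketch).
WORLDS = ("none", "some_not_all", "all")

def wants(state, prop):
    """John wants p iff p holds in all of his desire-worlds."""
    return all(prop(w) for w in state)

def tries_some(w):
    return w != "none"

def tries_all(w):
    return w == "all"

def tries_not_all(w):
    return w != "all"

def states():
    """All non-empty desire states that verify the assertion 'John wants Fred to try some'."""
    for n in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, n):
            if wants(combo, tries_some):
                yield frozenset(combo)

def global_implicature(s):
    """Step (i): John doesn't want Fred to try all of the cookies."""
    return not wants(s, tries_all)

def opinionated(s):
    """Step (ii): John wants Fred to try all, or wants Fred not to try all."""
    return wants(s, tries_all) or wants(s, tries_not_all)

def embedded_effect(s):
    """Step (iii): John wants Fred not to try all of the cookies."""
    return wants(s, tries_not_all)

# With the contextual assumption (ii), the strengthened conclusion always follows.
with_assumption = all(
    embedded_effect(s) for s in states() if global_implicature(s) and opinionated(s)
)

# Without (ii), the global implicature alone does not yield the embedded effect.
without_assumption = all(embedded_effect(s) for s in states() if global_implicature(s))
```

The sketch only shows that step (ii) carries the inference; it says nothing about how readily speakers grant (ii) for want as opposed to believe, which is the empirical question at issue.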
7 Thanks to Annick Morin for providing the French sentence Je (ne) veux pas que Marie mange
tous les biscuits and for her judgment.
We saw that dismissing the globalist account by appealing to the presuppositional
nature of this disjunctive proposition is not going to work.
According to the Russellian line followed by Geurts and Pouscoulous, an
assumption like “either John wants Fred to eat all of the cookies or John
wants Fred not to eat all of the cookies” is purely contextual and as such
it will be part of the common ground in some contexts but not in others.
Whenever the context grants this assumption, the strengthening of the global
implicature happens, giving rise to an embedded implicature effect without
an actual embedded implicature. If this is correct, then the reason why
subjects assented to the local implicature less frequently in the want case
than in the believe case must have to do with how likely they felt they could
make the relevant disjunctive assumption. In particular, it must be the case
that, in the absence of any context, subjects felt that the assumption in (17a)
was less likely to be true than the assumption in (17b).
(17) a. Either John wants Fred to try all of the cookies or John wants Fred
not to try all of the cookies.
b. Either Bob believes that Anna ate all of the cookies or he believes
that she didn’t.
If it is the case that, out of context, people are less likely to make the
assumption in (17a) than the one in (17b), we expect that it should be much
easier to trigger the apparent local implicature with want if the context allows
one to do so. In the globalist theory, then, plausibility considerations such as
the ones outlined above might be expected to distinguish among other NR
verbs which the localist theory would predict to pattern alike with respect to
embedded implicatures.8
The localist too can appeal to the context (and Chierchia et al. (2008)
leave this door open explicitly in their paper), but appealing to the context
8 One pair of predicates that might be interesting to test experimentally is the pair expect/ought
to, as in John expects Mary to try some of the cookies and Mary ought to try some of the
cookies. According to Horn 1978, both predicates are NR. The localist theory predicts that
both should give rise to a high rate of acceptance of the embedded implicature (“John
expects Mary not to try all of the cookies” and “Mary ought to not try all of the cookies”,
respectively), at least out of context. The globalist theory, on the other hand, would have to
appeal to two different disjunctive propositions in order to strengthen the global implicature
giving rise to an embedded implicature effect.
(18) a. Either John expects Mary to try all of the cookies or John expects Mary not to try
all of the cookies.
b. Either Mary ought to try all of the cookies or Mary ought to not try all of the
cookies.
At least out-of-context, it seems that (18a) would be easier to assume. If indeed (18a) is
more plausible than (18b), then the globalist theory predicts that the acceptance rate for the
embedded implicature should be higher in the expect case than in the ought case.
4 Conclusion
References
Michela Ippolito
Department of Linguistics
University of Toronto
130 St. George Street
Toronto, ON M5S 3H1
Canada
michela.ippolito@utoronto.ca
Semantics & Pragmatics Volume 3, Article 7: 1–13, 2010
doi: 10.3765/sp.3.7
Abstract
Conventionalist theories of scalar implicature differ from other accounts
in that they predict strengthening of embedded scalar terms. Geurts &
Pouscoulous (2009a) argue that experimental support for this prediction is
largely based on sentence comprehension tasks that inflate the frequency
with which terms like some are strengthened. Using a picture verification
task, they observed no strengthening of embedded scalars. We present
data from a multiple-choice picture verification task that is more sensitive
to interpretation preferences, and find that readers do show a preference
for strengthened interpretations even in embedded phrases. These data
cast doubt on Geurts and Pouscoulous’s empirical arguments against the
existence of embedded implicatures.
1 Introduction
Geurts & Pouscoulous (2009a)1 present data arguing against what they call
“mainstream conventionalist” and “minimal conventionalist” accounts of the
strengthening of scalar terms like some. Both positions (see Chierchia, Fox &
Spector 2008 for a survey; see Geurts & Pouscoulous 2009a for additional
references) claim that an “exclusivity” or O-operator is freely prefixed to
any S node with the result that a proposition containing some X, X or Y,
etc. is strengthened to ‘some but not all,’ exclusive ‘or’, etc. Mainstream
conventionalism claims that the strengthened interpretation is the preferred
interpretation, unless it occurs in a context (e.g. a downward-entailing context)
which results in a logically weaker global interpretation of the sentence
in which it occurs. Minimal conventionalism merely claims that the strengthened
interpretation is possible, but says nothing about preference.
∗ Acknowledgements: We thank Lyn Frazier for comments on an earlier version of our
manuscript. We thank Maria Bonilla and Morgan Mendes for their assistance in this research.
This project was supported in part by Grant Number HD18708 from NICHD to the University
of Massachusetts. The contents of this paper are solely the responsibility of the authors and
do not necessarily represent the official views of NICHD or NIH.
1 See Chemla 2009, and Geurts & Pouscoulous 2009b, for more discussion.
One way to evaluate conventionalist approaches is to examine ‘embedded
implicatures’ (or, following Geurts & Pouscoulous, ‘local scalar implicatures’).
Consider a sentence like (1) (Geurts & Pouscoulous’s (7a)):
Insertion of the exclusivity operator under the scope of all students entails
that all students read some but not all of Chierchia’s papers and thus that no
students read all of Chierchia’s papers. This should be the preferred reading
according to mainstream conventionalism, because it is a stronger (more
limited) claim than the non-strengthened claim. It is also a possible reading
according to minimal conventionalism. However, it is not a pragmatically
justified reading from a Gricean perspective. The author of the statement
presumably did not believe that all students read all of Chierchia’s papers
(else he would have said that). Thus, the pragmatically justified implication
of (1) is (2a). It is not (2b), which is entailed if the exclusivity operator is
inserted.
(2) a. It is not the case that all students read all of Chierchia’s papers.
b. All students read not all of Chierchia’s papers.
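The difference in strength between (2a) and (2b) can be verified by brute force. The sketch below assumes a hypothetical miniature scenario with two students and three papers, records only how many papers each student read, and restricts attention to scenarios where every student read at least one paper (the unstrengthened reading). The sizes and function names are illustrative assumptions.

```python
from itertools import product

N_PAPERS = 3    # hypothetical number of papers
N_STUDENTS = 2  # hypothetical number of students

def literal(scenario):
    """Unstrengthened reading: every student read at least one paper."""
    return all(k >= 1 for k in scenario)

def two_a(scenario):
    """(2a): it is not the case that all students read all of the papers."""
    return not all(k == N_PAPERS for k in scenario)

def two_b(scenario):
    """(2b): all students read not all of the papers."""
    return all(k < N_PAPERS for k in scenario)

# A scenario is a tuple giving, per student, the number of papers read.
scenarios = [
    sc for sc in product(range(N_PAPERS + 1), repeat=N_STUDENTS) if literal(sc)
]

# (2b) is the stronger claim: whenever it holds, (2a) holds too.
b_entails_a = all(two_a(sc) for sc in scenarios if two_b(sc))

# Scenarios compatible with the Gricean implication (2a) but not with (2b).
a_without_b = [sc for sc in scenarios if two_a(sc) and not two_b(sc)]
```

The scenarios in `a_without_b` are those in which one student read every paper and the other did not. These are exactly the cases that separate the Gricean prediction from the locally strengthened one.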
Geurts & Pouscoulous (2009a) argue that introspective evidence is not ad-
equate to decide what people usually do take sentences with scalar terms
to mean (an argument that is particularly persuasive when the theorist is
doing the introspecting). They present some very interesting ‘verification’
experiments which they claim disconfirm both flavors of conventionalism
(but are consistent with a construal of Gricean pragmatics). In these experi-
ments, a subject is shown a picture and asked whether a sentence containing
a scalar term ‘correctly describes’ the picture. Their subjects nearly univer-
sally accepted sentences as correctly describing pictures that a strengthened
Embedded implicatures observed
interpretation of the sentence was not true of. For instance, 100% of Geurts
and Pouscoulous’s subjects accepted the sentence in Figure 1 (from Geurts &
Pouscoulous 2009a) as correctly describing the arrangement shown in the
figure, even though the locally-strengthened interpretation (‘all of the squares
are connected to some but not all of the circles’ and thus ‘none of the squares
are connected to all of the circles’) is false of the figure. They concluded,
on the basis of data like these, that “the conventionalist approach to scalar
implicatures has little to recommend it” (Geurts & Pouscoulous 2009a, p. 431).
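[Figure 1 (from Geurts & Pouscoulous 2009a): target sentence and display, with response options “true” and “false”.]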
Geurts & Pouscoulous (2009a) acknowledged that data they obtained in verbal
“inference” tasks (in which subjects are asked whether a sentence like All
the squares are connected with some of the circles implies All the squares
are connected with some but not all of the circles) exhibited a fair proportion
(on the order of 50%) of strengthened interpretations. However, they state
that such data are suspect. They argue that the proportion of acceptances of
strengthened interpretations is inflated, perhaps because subjects’ attention
is called to the putative implication, so that subjects confuse it with the
legitimate non-embedded Gricean implicature (The square is connected with
some of the circles pragmatically implicates The square is connected with
some but not all of the circles).
We were concerned that the verification task used by Geurts & Pous-
coulous (2009a) has its own bias. Displays like that in Figure 1 can be
Clifton & Dube
correctly described in many ways: There are squares and circles; Squares
and circles are connected to each other; Some squares are connected to some
circles; etc. A pragmatic perspective does not require that only the strongest
interpretation is a correct description, even if it is the preferred description.
Similarly, while a mainstream conventionalist perspective claims that the
preferred (strengthened) interpretation is not strictly true of the display,
the existence of various weaker but legitimate descriptions of the display
suggests that the non-strengthened interpretation may be acceptable. It
may be that the locally-strengthened interpretation is considered to be the
best interpretation of the sentence, as long as it is the globally-strongest
interpretation. However, Geurts and Pouscoulous’s subjects were not asked
whether the display was the best possible depiction of the target sentence.
They were only asked whether the sentence correctly described the display.
A variety of weaker statements and interpretations can still be considered to
be correct descriptions of the display.
From this perspective, it is tempting to consider what would happen if
the subject were given a choice between two displays, one of which honors
the locally-strengthened interpretation and the other of which violates it. If
the locally-strengthened interpretation is the preferred one (as claimed by
the mainstream conventionalist position), subjects should choose the display
that honors it rather than the one that does not. If minimal conventionalism
is on the right path, then subjects should be equally happy choosing either
display. And the same should be true if Gricean pragmatics rules the day:
the proper interpretation should be ‘All the squares are connected to some
and possibly all of the circles.’
We conducted two experiments, modeled on Geurts & Pouscoulous’s
(2009a) Experiments 2 and 3. In each case, we shifted from a verification
format to a choice format. Subjects were shown a sentence and two figures
(generally one honoring a locally-strengthened interpretation, one honoring
only a basic interpretation; see below for details), and asked to choose which
picture was best described by the sentence, with four response options: the
‘strengthened’ picture, the ‘basic’ picture, “both,” or “neither.” Both experiments were conducted in
a single session, with randomly intermixed presentation of items including
filler items, as described below.
2 Experiment 1
2.1 Materials
present only the some verification data here, for comparability with Geurts
and Pouscoulous.
diagram (see Figure 2). Subjects made the verification response via key-press.
No time constraint was imposed on the subjects, and participation in the
study took approximately 20 minutes.
Table 1 contains the percentages of choices of each of the four options. The
results are very clear. There was a preponderance of choices of the ‘B’ pair
of boxes, in which some but not all of the named items (e.g., stars) were
on the left; there were more choices of B than A: t(35) = 9.8, p < .001, 95%
CI of difference: (.54, .82). This, of course, is the choice that is consistent
with a strengthened interpretation. Choices of ‘both,’ consistent with a non-
strengthened ‘some and possibly all,’ were fairly infrequent and failed to
rise above the arguable chance level of .25 choices of a given option, t(35)
= .13, p = .90, 95% CI: (.13, .35). Choices of the A picture (which Geurts &
Pouscoulous’s subjects accepted 66% of the time) and the ‘neither’ item were
essentially non-existent.
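As a rough consistency check on the reported B-versus-A comparison, the t statistic can be approximately recovered from the confidence interval. The sketch assumes that the interval is a symmetric t-based interval around the mean difference, and it hard-codes the two-sided 95% critical value for 35 degrees of freedom (about 2.030, from a standard t table). Both are assumptions of the sketch, and the inputs are the rounded values printed in the text.

```python
# Reported values from the text: 95% CI of the difference between B and A choices.
ci_low, ci_high = 0.54, 0.82
df = 35

# Assumption: two-sided 95% critical value of Student's t for df = 35,
# taken from a standard t table.
T_CRIT_95_DF35 = 2.030

# Midpoint of a symmetric interval estimates the mean difference.
mean_diff = (ci_low + ci_high) / 2

# Half-width of the interval equals critical value times standard error.
half_width = (ci_high - ci_low) / 2
std_error = half_width / T_CRIT_95_DF35

# Recovered t statistic for the test of the difference against zero.
t_recovered = mean_diff / std_error
```

The recovered value is close to 9.9, in line with the reported t(35) = 9.8 once rounding of the interval endpoints is taken into account.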
Choice Option
A B* C (“both”) D (“neither”)
3 Experiment 2
3.1 Materials
Four sentences were constructed that contained the scalar some. They were
written in two versions each, as illustrated in (4), one with the universal
quantifier all and the other with each.2 Both forms involve embedded impli-
catures, and do not support scalar implicatures from a Gricean perspective.
Each of the four items referred to a different triple of shapes.
2 This manipulation was included based on the intuition – which proved to be incorrect –
that the more individuating nature of each compared to all would discourage a ‘group’
interpretation of the predicate and encourage strengthening.
Two different figures, each with two designs, were made up for each of the
four items. An illustration appears in Figure 3. One figure (top panel in
Figure 3, Version 1) contained one design that honored the strengthened
interpretation (the B item) and one design that honored the unstrengthened
‘all’ interpretation. The predictions for these items were laid out earlier. The
other figure (bottom panel, Version 2) was designed so that neither design
was true of the strengthened interpretation. For these items, a reader who
arrived at that interpretation (i.e., a reader who made a local or embedded
implicature) should choose Option D, ‘neither.’ A reader who did not take the
strengthened interpretation should find either display acceptable and ideally
choose Option C, ‘both.’
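The predictions for the two figure versions can be spelled out with a small sketch. It assumes a sentence of the form All the squares are connected with some of the circles and represents each design as a list giving, for every square, the set of circles it is connected to. The concrete designs, the number of circles, and the function names are hypothetical and are not the actual displays of Figure 3.

```python
N_CIRCLES = 3  # hypothetical number of circles in each design

def basic(design):
    """Unstrengthened reading: every square is connected to at least one circle."""
    return all(len(circles) >= 1 for circles in design)

def strengthened(design):
    """Locally strengthened reading: every square is connected to some but not all circles."""
    return all(1 <= len(circles) < N_CIRCLES for circles in design)

def predicted_choice(design_a, design_b, reading):
    """Response predicted for a reader who evaluates both designs under one reading."""
    a_ok, b_ok = reading(design_a), reading(design_b)
    if a_ok and b_ok:
        return "C (both)"
    if a_ok:
        return "A"
    if b_ok:
        return "B"
    return "D (neither)"

# Hypothetical Version 1: A honors only the 'all' interpretation, B honors strengthening.
v1_a = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]
v1_b = [{0}, {1, 2}, {0, 2}]

# Hypothetical Version 2: neither design is true on the strengthened reading.
v2_a = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]
v2_b = [{0, 1, 2}, {0}, {1}]

choices = {
    ("version 1", "strengthened"): predicted_choice(v1_a, v1_b, strengthened),
    ("version 1", "basic"): predicted_choice(v1_a, v1_b, basic),
    ("version 2", "strengthened"): predicted_choice(v2_a, v2_b, strengthened),
    ("version 2", "basic"): predicted_choice(v2_a, v2_b, basic),
}
```

On these assumptions a reader who strengthens locally picks B in Version 1 and “neither” in Version 2, while a reader who does not strengthen finds both designs acceptable in either version.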
Since they were conducted together, details regarding the subjects and pro-
cedures for Experiment 2 are identical to those of Experiment 1, with the
exception that each subject received 8 critical trials. Each subject saw all four
sentences twice, once where one figure honored the strengthened interpreta-
tion (Figure 3, Version 1) and once where neither figure did (Figure 3, Version
2). Two of each of these had the quantifier all and two, each, counterbalanced
over subjects so that each item was tested with each quantifier equally often.
Apart from this variation, trials differed only in the particular forms used
(circles, triangles, stars, moons, hearts, etc.).
4 Conclusions
between the two types of figures (and further, that they showed a smaller but
still substantial frequency of rejecting both figures when neither honored
strengthening). We submit that Geurts and Pouscoulous’s conclusion that
readers do not make embedded implicatures is based on suspect data, and
hence is at best premature.
Theoretically, though, the cup may be only half full. While our data
show that readers who make the choice between the strengthened and the
unstrengthened interpretation of an embedded scalar strongly prefer the
former, they also show that the most common response is not to choose
between the interpretations but to accept both. Such ecumenism is not a
given; Experiment 1, which tested non-embedded scalar terms, found that
“both” choices were fairly infrequent. The choice of “both” in Experiment
2 presumably reflects the absence of strengthening. Perhaps the right con-
clusion is that an apparently strengthened interpretation of an embedded
scalar term like some is possible, but not obligatory and not even preferred.
This conclusion may present some difficulty to one who holds a pragmatic
Gricean perspective. As Geurts & Pouscoulous (2009a) make clear, Gricean
accounts of strengthening of scalar terms under the scope of (e.g.) think
and believe (Geurts 2009) do not readily generalize to scalar terms under
the scope of all or each. In the absence of a Gricean account of pragmatic
strengthening under the scope of such terms, our results call Gricean ac-
counts generally into question. Similarly, our findings may present some
difficulty for a mainstream conventionalist perspective: It is not clear from
such a perspective why the strengthened interpretation is apparently taken
less frequently than the basic interpretation. The minimal conventionalist
perspective discussed by Geurts & Pouscoulous (2009a) can accommodate
our data, as can a perspective that says that terms like some are simply
ambiguous, but these perspectives are so unconstraining that one would
hope to adopt them only as a last resort. We can conclude only that the
evidence presented by Geurts & Pouscoulous (2009a) has not made a solid
case against the existence of local, embedded implicatures. We trust that
additional experimental research will clarify the conditions under which such
implicatures are made, and hope that additional linguistic analysis will shed
light on why these conditions encourage strengthening.
References
Chemla, Emmanuel. 2009. Universal Implicatures and free choice effects: Experimental
data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical
view of scalar implicatures and the relationship between semantics and
pragmatics. In Claudia Maienborn, Klaus von Heusinger & Paul Portner
(eds.), Semantics: An international handbook of natural language meaning,
Berlin: Mouton de Gruyter. http://semanticsarchive.net/Archive/WMzY2ZmY/CFS_EmbeddedSIs.pdf.
To appear.
Geurts, Bart. 2009. Scalar implicatures and local pragmatics. Mind and
Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009a. Embedded implicatures?!?
Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: a response
to Emmanuel Chemla. Semantics and Pragmatics 2(5). 1–10.
doi:10.3765/sp.2.5.
Semantics & Pragmatics Volume 3, Article 11: 1–28, 2010
doi: 10.3765/sp.3.11
1 Introduction
Neo-Gricean explanations of what is meant but not explicitly said are very
appealing. They start with what is explicitly expressed by an utterance, and
then seek to account for what is meant in a global way by comparing what
the speaker actually said with what he could have said. Recently, some
researchers (e.g., Levinson (2000), Chierchia (2006), Fox (2007)) have argued
that it is wrong to start with what is explicitly expressed by an utterance.
Instead — or so it is argued — implicatures should be calculated locally at
linguistic clauses. For what it is worth, I find the traditional globalist analysis
of implicatures more appealing, and all other things equal, I prefer the global
analysis to a localist one. But, of course, not all things are equal. Localists
provided two types of arguments in favor of their view: experimental evidence
and linguistic data. I believe that the ultimate “decision” on which line to
take should, in the end, depend only on experimental evidence. I do not have
much to say about this, but I admit to being happy with the experimental results
reported by Chemla (2009) and Geurts & Pouscoulous (2009a), which mostly
seem to favor a neo-Gricean explanation.
∗ The content of this paper was crucially inspired by Michael Franke’s dissertation, and earlier
work done on free choice permission by Katrin Schulz. Besides them, I would also like to
thank the reviewer of this paper and the editors of this journal (David Beaver in my case) for
their useful and precise comments on an earlier version of this paper.
But localists provided linguistic examples as well, examples that according
to them could not be explained by standard “globalist” analyses. Impossibility
proofs in pragmatics, however, are hard to give. Many examples
involve triggers of scalar implicatures like or or some embedded under other
operators. Some early examples include φ ∨ (ψ ∨ χ) and □(φ ∨ ψ). Localist
theories of implicatures were originally developed to account for examples
of this form. As for the first type of example, globalists soon pointed out
that these are actually unproblematic to account for. As for the second type,
Geurts & Pouscoulous (2009a) provide experimental evidence that implicature
triggers like or and some used under the scope of an operator like believe or
want do not necessarily give rise to local implicatures. That is, many more
participants of their experiments infer the implicature (1-b) from (1-a), than
infer (2-b) and (3-b) from (2-a) and (3-a), respectively. Moreover, they show
that there is little evidence that people in fact infer (3-b) from (3-a).
1 See Geurts & Pouscoulous 2009a and Geurts & Pouscoulous 2009b for discussion, and
footnote 18.
Conjunctive interpretation of disjunction
Robert van Rooij
For instance, intuitively we infer from (5-a) that both (5-b) and (5-c) are true:
(5) a. If Spain had fought on either the Allied side or the Nazi side, it
would have made Spain bankrupt.
b. If Spain had fought on the Allied side, it would have made Spain
bankrupt.
c. If Spain had fought on the Nazi side, it would have made Spain
bankrupt.
becomes valid. That is, by accepting SDA, we can derive MON on the as-
sumption that the connectives are interpreted in a Boolean way,5 and we
end up with a strict conditional account. We have seen already that the
strict conditional account (or the material conditional account) predicts SDA,
4 The assumption that for any world there is at least one closest φ-world for any consistent
φ — see Lewis 1973 for classic discussion.
5 From φ > χ and the assumption that connectives are interpreted in a Boolean way, we can
derive ((φ ∧ ψ) ∨ (φ ∧ ¬ψ)) > χ. By SDA we can then derive (φ ∧ ψ) > χ.
but perhaps for the wrong reasons. The Lewis/Stalnaker account does not
validate MON because SDA is not a theorem of their logic. Although there are
well-known counterexamples to SDA,6 we would still like to explain why it
holds in “normal” contexts. A simple “explanation” would be to say that a
conditional of the form (φ ∨ ψ) > χ can only be used appropriately in case
the best φ-worlds and the best ψ-worlds are equally similar to the actual
world. Though this suggestion gives the correct predictions, it is rather ad
hoc. We would like to have a “deeper” explanation of this desired result in
terms of a general theory of pragmatic interpretation.
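The point about similarity can be checked on a toy model. The following is a minimal sketch, assuming a Lewis-style semantics with the limit assumption (a conditional holds iff all closest antecedent-worlds verify the consequent); the three worlds, the rank numbers, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def closest(antecedent, rank):
    """Antecedent-worlds with the lowest rank (most similar to the actual world)."""
    if not antecedent:
        return set()
    best = min(rank[w] for w in antecedent)
    return {w for w in antecedent if rank[w] == best}


def would(antecedent, consequent, rank):
    """A > C holds iff every closest antecedent-world is a consequent-world."""
    return closest(antecedent, rank) <= consequent


def sda_holds(phi, psi, chi, rank):
    """SDA instance: (phi or psi) > chi entails both phi > chi and psi > chi."""
    if not would(phi | psi, chi, rank):
        return True
    return would(phi, chi, rank) and would(psi, chi, rank)


def all_props(worlds):
    """Every subset of the set of worlds, read as a proposition."""
    items = sorted(worlds)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


WORLDS = {"a", "b", "c"}          # hypothetical toy model
phi, psi = {"a"}, {"b"}
unequal = {"a": 1, "b": 2, "c": 3}  # best phi-world closer than best psi-world
equal = {"a": 1, "b": 1, "c": 3}    # best phi- and psi-worlds equally similar

counterexamples_unequal = [chi for chi in all_props(WORLDS)
                           if not sda_holds(phi, psi, chi, unequal)]
counterexamples_equal = [chi for chi in all_props(WORLDS)
                         if not sda_holds(phi, psi, chi, equal)]
```

In this toy model SDA has counterexamples only under the `unequal` ranking, which is the situation the appropriateness condition is meant to exclude.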
Let us first look at the latter approach according to which deontic operators
are construed as action modalities. Dynamic logic (Harel 1984) makes a
distinction between actions (and action expressions) and propositions. Propo-
sitions hold at states of affairs, whereas actions produce a change of state.
Actions may be nondeterministic, having different ways in which they can
be executed. The primary logical construct of standard dynamic logic is the
modality ⟨α⟩φ, expressing that φ holds after α is performed. This modality
operates on an action α and a proposition φ, and is true in world w if
some execution of the action α in w results in a state/world satisfying the
proposition φ.
Dynamic logic starts with two disjoint sets; one denoting atomic propo-
sitions, the other denoting atomic actions. The set of action expressions is
then defined to be the smallest set A containing the atomic actions such that
if α, β ∈ A, then α ∨ β ∈ A and α; β ∈ A.9 The set of propositions is defined
as usual, with the addition that it is assumed that if α is an action expression
and φ a proposition, then ⟨α⟩φ is a proposition as well. To account
for permission sentences we will assume that in that case also Per(α) is a
proposition.
Propositions are just true or false in a world. To interpret the action
expressions, it is easiest to let them denote pairs of worlds. The mapping
τ gives the interpretation of atomic actions. The mapping τ is extended to
give interpretations to all action expressions by τ(α; β) = τ(α); τ(β) and
τ(α ∨ β) = τ(α) ∪ τ(β). The action α; β consists of executing first α, and
then β. The action α ∨ β can be performed by executing either α or β. We
write $\tau_w(\alpha)$ for the set {v ∈ W | ⟨w, v⟩ ∈ τ(α)}. Thus, $\tau_w(\alpha)$ is the set of all
worlds you might end up in after performing α in w. We will say that Per(α)
is true in w, w ⊨ Per(α),10 just in case $\tau_w(\alpha) \subseteq P_w$, where $P_w$ is the set of
8 There is yet another way to go, which recently became popular as well (e.g., Portner 2007):
assume that permission applies to an action, but assume that a permission statement also
changes what is permitted. I won’t go into this story here. Another story I won’t go into here
is the resource-sensitive logic approach to free choice permission proposed in Barker 2010,
a paper I became aware of just as the current paper was going to press.
9 I will ignore iteration here.
10 Strictly speaking the definition of ⊨ should be relativized to a model, but the model remains
implicit here as throughout the paper.
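The action semantics just described can be sketched in a few lines. This is only an illustration under stated assumptions: actions are modelled as sets of world pairs, the set P_w is read as the set of worlds that are acceptable from w (the source sentence defining it is cut off, so this reading is an assumption), and the world names and function names are hypothetical.

```python
def compose(r, s):
    """Sequential composition alpha; beta as relational composition."""
    return {(w, v) for (w, x) in r for (y, v) in s if x == y}


def choice(r, s):
    """Nondeterministic choice alpha or beta as the union of the two relations."""
    return r | s


def outcomes(r, w):
    """tau_w(alpha): the worlds one might end up in after performing alpha in w."""
    return {v for (x, v) in r if x == w}


def permitted(r, w, acceptable):
    """Per(alpha) is true in w iff every outcome of alpha from w is acceptable."""
    return outcomes(r, w) <= acceptable


# Hypothetical atomic actions starting from world w0.
alpha = {("w0", "w1")}
beta = {("w0", "w2")}
either = choice(alpha, beta)

only_alpha_ok = {"w1"}      # hypothetical P_w0: only the alpha-outcome is acceptable
both_ok = {"w1", "w2"}      # hypothetical P_w0: both outcomes are acceptable

free_choice = permitted(either, "w0", both_ok)
weakening_fails = (permitted(alpha, "w0", only_alpha_ok)
                   and not permitted(either, "w0", only_alpha_ok))
```

Because the outcomes of a choice are the union of the outcomes of its parts, permission of the disjunctive action carries over to each disjunct, while permission of α alone does not extend to α ∨ β.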
Lewis (1979) and Kamp (1973, 1979) have proposed a performative analysis
of command and permission sentences involving a master and his slave. On
their analysis, such sentences are not primarily used to make true assertions
about the world, but rather to change what the slave is obliged/permitted to
do.13 But how will permission sentences govern the change from the prior
permissibility set, Π, to the posterior one, Π′? Kamp (1979) proposes that
this change depends on a reprehensibility ordering, ≤, on possible worlds.
The effect of allowing φ is that the best φ-worlds are added to the old
permissibility set to figure in the new permissibility set. This set will be
denoted as $\Pi^*_\varphi$ and is defined in terms of the relation ≤ as follows:
(7) $\Pi^*_\varphi \overset{\mathrm{def}}{=} \{u \in \varphi \mid \forall v \in \varphi : u \leq v\}$
Thus, the change induced by the permission You may do φ is that the new
permissibility set, Π′, is just $\Pi \cup \Pi^*_\varphi$. Note that according to this performative
account it does not follow that for a permission sentence of the form You
11 See Asher & Bonevac 2005 for a conditional analysis of permissions sentences.
12 Notice also that another paradox of standard deontic logic is avoided now: from the
permission of α, Per(α), the permission of α ∨ β, Per(α ∨ β) doesn’t follow.
13 For further discussion of this model, see e.g, van Rooij 2000.
may do φ or ψ the slave can infer that according to the new permissibility set
he is allowed to do any of the disjuncts. Still, in terms of Kamp’s analysis we
can give a pragmatic explanation of why disjuncts are normally interpreted
in this “free choice” way. To explain this, let me first define a deontic
preference relation between propositions, ≺, in terms of our reprehensibility
relation between worlds, <. We can say that although both φ and ψ are
incompatible with the set of ideal worlds, φ is still preferred to ψ, φ ≺ ψ,
iff the best φ-worlds are better than the best ψ-worlds, ∃v ∈ φ and
∀u ∈ ψ : v < u. Then we can say that with respect to ≺, φ and ψ are
equally reprehensible, φ ≈ ψ, iff φ ⊀ ψ and ψ ⊀ φ. It is easily seen that
$\Pi^*_{\varphi \lor \psi} = \Pi^*_\varphi \cup \Pi^*_\psi$ iff φ ≈ ψ. How can we now explain the free choice effect?
According to a straightforward suggestion, a disjunctive permission can
only be made appropriately in case the disjuncts are equally reprehensible.14
This suggestion, of course, exactly parallels the earlier suggestions of when
conditionals with disjunctive antecedents can be used appropriately, or
disjunctive permissions according to the dynamic logic approach. Like these
earlier suggestions, however, this new suggestion by itself is rather ad hoc,
and one would like to provide a “deeper” explanation in terms of more
general principles of pragmatic reasoning.
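The update in (7) and the equivalence just mentioned can be tried out on a small model. The sketch below assumes that the reprehensibility ordering is a total preorder that can be encoded by a numeric rank (lower means less reprehensible); the world names, the ranks, and the function names are hypothetical.

```python
def best(prop, rank):
    """The set in (7): the least reprehensible worlds of a proposition."""
    if not prop:
        return set()
    lowest = min(rank[w] for w in prop)
    return {w for w in prop if rank[w] == lowest}


def permit(pi, prop, rank):
    """Posterior permissibility set after 'You may do prop'."""
    return pi | best(prop, rank)


def equally_reprehensible(p, q, rank):
    """Neither proposition has strictly better best worlds than the other."""
    return min(rank[w] for w in p) == min(rank[w] for w in q)


# Hypothetical model: "ideal" is the only world permitted so far.
rank = {"ideal": 0, "a": 1, "b": 2, "c": 1}
pi = {"ideal"}
phi, psi_worse, psi_equal = {"a"}, {"b"}, {"c"}

after_unequal = permit(pi, phi | psi_worse, rank)   # only the phi-world is added
after_equal = permit(pi, phi | psi_equal, rank)     # both disjuncts are added
```

With `psi_worse` the disjunctive permission adds only the φ-world, so the second disjunct is not made permissible; with `psi_equal` both best worlds enter the new set, which is the free choice pattern.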
3 Pragmatic interpretation
If the alternative of Some of the students passed is All of the students passed,
the desired scalar implicature is indeed accounted for. McCawley (1993)
noticed, however, that if one scalar item is embedded under another one — as
14 For an alternative proposal using this framework, see van Rooij 2006.
in (9)15 — an interpretation rule like Prag does not give rise to the desired
prediction that only one student passed if the alternatives are defined in the
traditional way.
The pragmatic interpretation rule Exh correctly predicts that from (9) we can
pragmatically infer that only one of Alice, Bob, and Cindy passed. In fact, this
pragmatic interpretation rule is better known as the exhaustive interpretation
of a sentence (e.g., Groenendijk & Stokhof 1984, van Rooij & Schulz 2004,
Schulz & van Rooij 2006, Spector 2003, 2006). By interpreting sentences
exhaustively one can account for many conversational implicatures. But
from a purely Gricean point of view, the rule is too strong. All that the
Gricean maxims seem to allow us to conclude from a sentence like Some of
the students passed is that the speaker does not know that All of the students
passed is true; not the stronger proposition that the latter sentence is false.
To account for this intuition, the following weaker interpretation rule, Grice,
can be stated, which talks about knowledge rather than facts (where Kφ
means that the speaker knows φ):16
As shown by Spector (2003) and van Rooij & Schulz (2004), exhaustive inter-
pretation follows from this, if we assume that the speaker is as competent as
possible insofar as this is compatible with Grice.
that the competence assumption is formalized in the wrong way. Indeed, this
was proposed by Schulz (2003, 2005) to account for free choice permissions.
As for the latter, she took a speaker to be competent in case she knows of
each alternative whether it is true. Second, she took the set of alternatives
of ♦φ to be the set {□ψ : ψ ∈ A(φ)} ∪ {□¬ψ : ψ ∈ A(φ)}.18 First, notice
that by applying Grice to a sentence of the form ♦(φ ∨ ψ) it immediately
follows that the speaker knows neither □¬φ nor □¬ψ, in formulas, ¬K□¬φ
and ¬K□¬ψ. What we would like is that from here we derive the free choice
reading: ♦φ and ♦ψ, which would follow from K¬□¬φ and K¬□¬ψ. Of
course, this doesn’t follow yet, because it might be that the speaker does not
know what the agent may or must do.19 But now assume that the speaker is
competent on this in Schulz’ sense. Intuitively, this means that P□φ ≡ K□φ
and P♦φ ≡ K♦φ. Remember that after applying Grice, it is predicted that
neither K□¬φ nor K□¬ψ holds, which means that P¬□¬φ and P¬□¬ψ have
to be true. The latter, in turn, are equivalent to P♦φ and P♦ψ. By competence
we can now immediately conclude K♦φ and K♦ψ, from which we can
derive ♦φ and ♦ψ as desired, because knowledge implies truth.20
Although I find this analysis appealing, it is controversial, mainly because
of her choice of alternatives. This also holds for other proposed pragmatic
analyses to account for free choice permissions, such as, for example, that of
Kratzer & Shimoyama (2002). In section 4 I will discuss some other possible
analyses that explain the desired free choice inference that assume that the
alternatives of (φ ∨ ψ) > χ and ♦(φ ∨ ψ) are φ > χ and ψ > χ, and ♦φ and
♦ψ, respectively.
18 Taking □φ as an alternative is natural to infer from ♦φ to the falsity of this necessity
statement.
19 Notice, though, that this inference does follow if ‘□’ and ‘♦’ stand for epistemic must and
epistemic might. This is so, because for the epistemic case we can safely assume that the
speaker knows what he believes, which can be modeled by taking the epistemic accessibility
relation to be fully introspective. This gives the correct predictions, because from Katrin
might be at home or at work, it intuitively follows that, according to the speaker, Katrin
might be at home, and that she might be at work (cf., Zimmermann 2000).
20 Notice that it is also Schulz’ reasoning and notion of competence for Anna ate all of the
cookies that is used to explain why from (2-a) we conclude to (2-b).
pragmatically entails φ > χ and ψ > χ, because if not, the speaker could
have used an alternative expression which more accurately singled out the
actual world.
Intuitive solutions are ok, but to test them, we have to make them precise.
In the following I will suggest two ways to implement the above intuition.
Both implementations are based on the idea that to account for the desired
“conjunctive” inferences of the disjunctive sentences, alternative expres-
sions and alternative worlds/interpretations must play a very similar role
in pragmatic interpretation. Thinking of it in somewhat different terms,
we should take seriously both the speaker’s and the hearer’s perspective.
Fortunately, there are two well-known theories on the market that look at
pragmatic interpretation from such a point of view: Bi-directional Optimality
Theory (e.g., Blutner 2000), and Game Theory (e.g., Benz, Jäger & van Rooij
2005). In the following I will discuss two possible ways to proceed, but they
have something crucial in common: both ways make use of different levels
of interpretation. The first proposal is game-theoretic in nature, and due
to Franke (2010, 2009). The second suggestion is a less radical departure
from the “received view” in pragmatics, and is more in the spirit of Bi-OT.
It makes crucial use of exhaustive interpretation and of different levels of
interpretation, but like in Bi-OT, alternative worlds and expressions that
initially played a role in interpretation need not play a role anymore at higher
levels.23
P(w | φ)       u      v      w      Σ_w P(w | φ)
Some           1/3    1/3    1/3    = 1
Most           0      1/2    1/2    = 1
All            0      0      1      = 1

P(w | f)       $u_{\varphi\prec\psi}$    $v_{\psi\prec\varphi}$    $w_{\varphi\approx\psi}$    Σ_w P(w | f)
φ > χ          1/2    0      1/2    = 1
ψ > χ          0      1/2    1/2    = 1
(φ ∨ ψ) > χ    1/3    1/3    1/3    = 1
On the assumption that all worlds are equally likely, here is a straightforward
way to reformulate (8) making use of probabilities:
A number of things are worth remarking. First of all, all sentences are
true in $w_{\varphi\approx\psi}$. As a result of this, a (naive) hearer will interpret, for instance,
φ > χ as equally likely true in $u_{\varphi\prec\psi}$ as in $w_{\varphi\approx\psi}$. Now take the speaker’s
perspective. Which statement would, or should, she make given that she is in
a particular situation, or world? Naturally, that statement that gives her the
highest chance that the (naive) hearer will interpret the message correctly.
Thus, she should utter that sentence which gives the highest number in the
column. But this means that in $u_{\varphi\prec\psi}$ she should (and rationally would) utter
φ > χ, in $v_{\psi\prec\varphi}$ she should utter ψ > χ, and in $w_{\varphi\approx\psi}$ it doesn’t matter what
she utters, both are equally good. The boxed entries model this speaker’s
choice. The important thing to note is that according to this reasoning, no
speaker (a speaker in no world) would ever utter (φ ∨ ψ) > χ. Still, this
is exactly the message that was uttered and should be interpreted, so we
obviously missed something.
Franke (2009) proposes that our reasoning didn’t go far enough. We
should now take the hearer’s perspective again, taking into account the
optimal speaker’s message choice given a naive semantic interpretation of the
hearer.25 This can best be represented by modeling the probabilities of the
messages sent according to the previous reasoning, given the situation/world
that the speaker is in.26 How should the hearer now interpret the messages?
Well, because the speaker would always send φ > χ in $u_{\varphi\prec\psi}$, while the chance
that she sends φ > χ in $w_{\varphi\approx\psi}$ is lower (and taking the a priori probabilities
of the worlds to be equal), there is a higher chance that the speaker of φ > χ
is in world $u_{\varphi\prec\psi}$ than in $w_{\varphi\approx\psi}$, and thus the hearer will choose accordingly.
This is represented by the boxed entry in figure 3 (in which P (f | w) stands
for the probability with which the speaker would say f if she were in w).
Something similar holds for ψ > χ. As for (φ ∨ ψ) > χ, it is clear that all
worlds are equally likely now, given that a previous speaker would not make
this utterance in any of those worlds.
Having specified how such a more sophisticated hearer would interpret
the alternative utterances, we turn back to the speaker, but now assume that
the speaker takes such a more sophisticated hearer into account. First we fill
in the probabilities of the worlds, given the previous reasoning. Notice that
these probabilities are crucially different from the earlier P (w | f ). The
speaker now chooses optimally given these probabilities: i.e., the speaker
25 Jäger & Ebert (2009) make a similar move. Both models are instances of Iterated Best
Response (IBR) models.
26 For a more precise description, the reader should consult Franke 2009, obviously.
(φ ∨ ψ) > χ    0      0      0
Σ_f P(f | w)   1      1      1

P(w | f)       $u_{\varphi\prec\psi}$    $v_{\psi\prec\varphi}$    $w_{\varphi\approx\psi}$    Σ_w P(w | f)
φ > χ          1      0      0      = 1
ψ > χ          0      1      0      = 1
(φ ∨ ψ) > χ    1/3    1/3    1/3    = 1
chooses (one of) the highest rows in the columns. In $u_{\varphi\prec\psi}$ and $v_{\psi\prec\varphi}$ she
would choose as before, but in $w_{\varphi\approx\psi}$ she now chooses (φ ∨ ψ) > χ instead
of either of the others. This is again represented by boxed entries in figure 4.
If we take the hearer’s perspective again, the iteration finally reaches a
fixed point. As illustrated by figure 5, (φ ∨ ψ) > χ is now interpreted by
the even more sophisticated hearer in the desired way. From the truth of
(φ ∨ ψ) > χ, both φ > χ and ψ > χ pragmatically follow.
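The back-and-forth reasoning of the last few paragraphs can be replayed mechanically. The following is a simplified sketch, not Franke's actual model: it assumes equal prior probabilities, best responses that spread probability evenly over ties, and a fallback to the literal meaning for messages that no speaker sends. The message labels and function names are hypothetical.

```python
from fractions import Fraction

WORLDS = ["u", "v", "w"]  # u: phi-worlds closer, v: psi-worlds closer, w: equally close
TRUTH = {                  # worlds in which each message is literally true
    "phi>chi": {"u", "w"},
    "psi>chi": {"v", "w"},
    "(phi or psi)>chi": {"u", "v", "w"},
}
MESSAGES = list(TRUTH)


def uniform(support, domain):
    """Spread probability 1 evenly over `support`, 0 elsewhere in `domain`."""
    return {x: Fraction(1, len(support)) if x in support else Fraction(0)
            for x in domain}


def naive_hearer():
    """P(w | f): every world in which f is true is equally likely."""
    return {f: uniform(TRUTH[f], WORLDS) for f in MESSAGES}


def speaker_reply(hearer):
    """P(f | w): in each world, send the true messages the hearer decodes best."""
    strategy = {}
    for w in WORLDS:
        true_here = [f for f in MESSAGES if w in TRUTH[f]]
        top = max(hearer[f][w] for f in true_here)
        best = {f for f in true_here if hearer[f][w] == top}
        strategy[w] = uniform(best, MESSAGES)
    return strategy


def hearer_reply(speaker):
    """P(w | f): choose the worlds in which f is most likely to be sent.

    A message that no speaker sends falls back to its literal meaning.
    """
    hearer = {}
    for f in MESSAGES:
        top = max(speaker[w][f] for w in WORLDS)
        if top == 0:
            support = TRUTH[f]
        else:
            support = {w for w in WORLDS if speaker[w][f] == top}
        hearer[f] = uniform(support, WORLDS)
    return hearer


def iterate():
    """Alternate speaker and hearer replies until the hearer stops changing."""
    hearer = naive_hearer()
    while True:
        updated = hearer_reply(speaker_reply(hearer))
        if updated == hearer:
            return hearer
        hearer = updated


final = iterate()
```

Under these assumptions the loop stabilizes with the disjunctive conditional interpreted as the world in which both simple conditionals are true, mirroring the fixed point described above.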
Franke (2010, 2009) shows that by exactly the same reasoning free choice
permissions are accounted for as well.27 What is more, using exactly the same
machinery he can even explain (by making use of global reasoning) why we
infer from (φ ∨ ψ) > χ and ♦(φ ∨ ψ) that the alternatives (φ ∧ ψ) > χ and
♦(φ ∧ ψ) are not true, inferences that are sometimes taken to point to a local
analysis of implicature calculation.
27 Franke uses standard deontic logic, but that doesn’t seem essential. Starting with one of the
two more dynamic approaches, he could explain the free choice inference as well using a
very similar reasoning.
φ > χ          1      0      0
ψ > χ          0      1      0
(φ ∨ ψ) > χ    0      0      1
Σ_f P(f | w)   1      1      1
Is there any relation between the above game-theoretic reasoning and the
“received” analysis making use of pragmatic interpretation rule (8) or that of
exhaustive interpretation, (10)? I will suggest that a “bidirectional” received
view is at least very similar to Franke’s proposal sketched above, and does
the desired work as well.
In the above explanation, we started with looking at the semantic inter-
pretation from the hearer’s point of view. This way of starting things was
motivated by pragmatic interpretation rule (8):
But we could have started with the pragmatic interpretation rule (10) as well.
In that case we wouldn’t have started from the hearer’s, but rather from the
speaker’s point of view. Also this would have given rise to a reformulation
and a table, but now the probability function, P (ψ | w), gives the probabilities
with which the speaker would have used the alternative expression ψ given
the world w she is in. The naive assumption now is that P(ψ | w) is simply
$1/\mathrm{card}(\{\chi \in A(\varphi) : w \models \chi\})$ if w ⊨ ψ, and 0 otherwise. The reformulation now looks as
follows:
(13) $\mathrm{Exh}'(\varphi) = \{w \in \varphi \mid \neg\exists v : P(\varphi \mid v) > P(\varphi \mid w)\}$.
For the simple scalar implicature, the table to start with from a naive speaker’s
P(φ | w)       u      v      w
Some           1      1/2    1/3
Most           0      1/2    1/3
All            0      0      1/3
Σ_f P(f | w)   1      1      1

P(f | w)       u      v      w
♦φ             1/2    0      1/3
♦ψ             0      1/2    1/3
♦(φ ∨ ψ)       1/2    1/2    1/3
Σ_f P(f | w)   1      1      1
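The naive speaker probabilities and rule (13) can be recomputed for the scalar case as a quick check. This is a minimal sketch, assuming that the alternative set contains the asserted sentence itself and that world u verifies only Some, v verifies Some and Most, and w verifies all three; the function names are hypothetical.

```python
from fractions import Fraction

WORLDS = ["u", "v", "w"]
TRUTH = {                 # worlds in which each alternative is true
    "Some": {"u", "v", "w"},
    "Most": {"v", "w"},
    "All": {"w"},
}


def naive_speaker(message, world):
    """P(message | world): 1 / (number of true alternatives) if true, else 0."""
    true_here = [m for m in TRUTH if world in TRUTH[m]]
    if message not in true_here:
        return Fraction(0)
    return Fraction(1, len(true_here))


def exh_prime(message):
    """Rule (13): worlds of the message where it is at least as likely to be
    used as in any other world."""
    probs = {w: naive_speaker(message, w) for w in WORLDS}
    top = max(probs.values())
    return {w for w in TRUTH[message] if probs[w] == top}


some_row = [naive_speaker("Some", w) for w in WORLDS]
```

On this model the rule selects u for Some, v for Most, and w for All, which is the usual exhaustive reading of each scalar alternative.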
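(14) $\mathrm{Exh}_n^{K_n}(\varphi) \overset{\mathrm{def}}{=} \{w \in [\![\varphi]\!]^{K_n} \mid \neg\exists v \in [\![\varphi]\!]^{K_n} : v <_{A_n(\varphi)} w\}$.
(15) $\mathrm{Prag}_n^{K_n}(\varphi) \overset{\mathrm{def}}{=} \{w \in \mathrm{Exh}_n^{K_n}(\varphi) \mid \neg\exists \psi \in A_n(\varphi),\ w \in [\![\psi]\!]^{K_n} \ \&\ \psi \prec_n \varphi\}$.
(16) $K_{n+1} \overset{\mathrm{def}}{=} \{w \in [\![\varphi]\!]^{K_n} \mid w \notin \mathrm{Exh}_n^{K_n}(\varphi)\}$.
(17) $A_{n+1}(\varphi) \overset{\mathrm{def}}{=} \{\psi \in A_n(\varphi) \mid \neg\exists w \in \mathrm{Exh}_n^{K_n}(\varphi),\ w \in [\![\psi]\!]^{K_n} \ \&\ \psi \prec_n \varphi\}$.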
Notice that (14) and (15) are just the straightforward generalizations with
respect to a set of worlds K of standard exhaustive interpretation rule (10)
and pragmatic interpretation rule (8) respectively.
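(10) $\mathrm{Exh}^{K}(\varphi) = \{w \in [\![\varphi]\!]^{K} \mid \neg\exists v \in [\![\varphi]\!]^{K} : v <_{A(\varphi)} w\}$.
(8) $\mathrm{Prag}^{K}(\varphi) = \{w \in [\![\varphi]\!]^{K} \mid \neg\exists \psi \in A(\varphi),\ w \in [\![\psi]\!]^{K} \ \&\ [\![\psi]\!]^{K} \subset [\![\varphi]\!]^{K}\}$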
The only difference between (14) and (10) is that the relevant set of worlds
and the relevant set of alternatives might depend on earlier stages in the
interpretation. If we limit ourselves to the first interpretation (i.e., level 0),
the two interpretation rules are identical. Similarly for the difference between
(15) and (8): the relevant alternatives depend on earlier stages, and the set of
worlds with respect to which the entailment relation between ψ and φ must
be determined depends on earlier stages as well. Indeed, if we look at the
first interpretation, the only important difference is that (15) takes as input
the exhaustive interpretation of φ, while this is not the case for (8). This
difference implements the view that speaker’s and hearer’s perspective are
both required.
The definitions (16) and (17) determine which worlds and alternative
expressions are relevant for the interpretation at the (n + 1)th level of
interpretation. We start with interpretation 0 (the first interpretation). Notice first
that level 1 is only reached in case $\mathrm{Prag}_0^{K_0}(\varphi) = \emptyset$, i.e., in case for each world
v in the exhaustive interpretation of φ there is an alternative expression ψ
that is true in v and which is stronger than φ. Thus, in that case there is no
world v ∈ Exh(φ) such that φ is at least as specific as any other alternative
that is true in v. For the interpretation of φ at level 1 we will not consider
worlds in the 0th-level exhaustive interpretation of φ anymore. This is what
(16) implements. The new set of alternatives determined by (17) are those
elements of the original set of alternatives $A_0$ that did not help to eliminate
worlds in Exh(φ) at the 0th level of interpretation.
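The first level of this procedure can be sketched as follows. The sketch covers level 0 and the step to the next set of worlds only, and it rests on two assumptions that are not spelled out in the surrounding text: a world v counts as smaller than w when the alternatives true in v form a proper subset of those true in w, and an alternative counts as stronger than φ when its denotation within K is a proper subset of that of φ. The four-world model and all function names are hypothetical.

```python
def true_alts(world, alts):
    """Names of the alternatives that are true in a world."""
    return {name for name, ext in alts.items() if world in ext}


def exh(sentence, alts, k):
    """Worlds of the sentence within K that verify a minimal set of alternatives."""
    dom = sentence & k
    return {w for w in dom
            if not any(true_alts(v, alts) < true_alts(w, alts) for v in dom)}


def prag(sentence, alts, k):
    """Exhaustive worlds in which no strictly stronger alternative is true."""
    dom = sentence & k
    return {w for w in exh(sentence, alts, k)
            if not any(w in ext and (ext & k) < dom for ext in alts.values())}


def next_worlds(sentence, alts, k):
    """Worlds kept for the next level: those outside the exhaustive interpretation."""
    return (sentence & k) - exh(sentence, alts, k)


# Hypothetical model for a disjunctive permission with three alternatives.
K0 = {"u", "v", "w", "x"}
may_disj = {"u", "v", "w", "x"}
alts0 = {
    "may phi": {"u", "w", "x"},
    "may psi": {"v", "w", "x"},
    "may (phi and psi)": {"x"},
}

exh0 = exh(may_disj, alts0, K0)
prag0 = prag(may_disj, alts0, K0)
K1 = next_worlds(may_disj, alts0, K0)
```

On this model the level-0 pragmatic interpretation comes out empty, so the procedure has to move on to the worlds collected in `K1`.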
Let us see how things work out for some particular examples. Let us
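first look at ♦(φ ∨ ψ) with A(♦(φ ∨ ψ)) = {♦φ, ♦ψ, ♦(φ ∧ ψ)}, and assume
that K = {u, v, w, x}, ⟦♦(φ ∨ ψ)⟧ = {u, v, w, x}, ⟦♦φ⟧ = {u, w, x}, ⟦♦ψ⟧ =
{v, w, x}, and ⟦♦(φ ∧ ψ)⟧ = {x}. Observe that $\mathrm{Exh}_0^{K_0}(\Diamond(\varphi \lor \psi)) = \{u, v\}$. But
neither u nor v can be an element of $\mathrm{Prag}_0^{K_0}(\Diamond(\varphi \lor \psi))$, because
$[\![\Diamond\varphi]\!]^{K_0} \subset [\![\Diamond(\varphi \lor \psi)]\!]$ and $[\![\Diamond\psi]\!]^{K_0} \subset [\![\Diamond(\varphi \lor \psi)]\!]$.
It follows that $\mathrm{Prag}_0^{K_0}(\Diamond(\varphi \lor \psi)) = \emptyset$.
We continue, and calculate $K_1$ and $A_1(\Diamond(\varphi \lor \psi))$. The new set of worlds
we have to consider, $K_1$, is just $K - \mathrm{Exh}_0^{K_0}(\Diamond(\varphi \lor \psi)) = \{w, x\}$. The new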
What about ♦(φ ∨ ψ ∨ χ), for instance? Once again we have to make a
closure assumption concerning the alternatives. As it turns out, the correct
way to go is also the most natural one: first, A(♦φ) = {♦ψ : ψ ∈ A(φ)},
and second, A(φ ∨ ψ ∨ χ) = {φ, ψ, χ, φ ∧ ψ, φ ∧ χ, ψ ∧ χ, φ ∧ ψ ∧ χ, φ ∨
ψ, φ ∨ χ, ψ ∨ χ}. Thus, at the “local” level, the alternatives are closed under
disjunction as well. Let us now assume that $[\![\Diamond\varphi]\!] = \{w_1, w_4, w_5, w_7\}$,
$[\![\Diamond\psi]\!] = \{w_2, w_4, w_6, w_7\}$, and $[\![\Diamond\chi]\!] = \{w_3, w_5, w_6, w_7\}$. Let’s assume for simplicity
that in none of these worlds any conjunctive permission like ♦(φ ∧ ψ) is
true. Observe that $\mathrm{Exh}_0^{K_0}(\Diamond(\varphi \lor \psi \lor \chi)) = \{w_1, w_2, w_3\}$. It follows that
$K_1 = \{w_4, w_5, w_6, w_7\}$ and the new set of alternatives is the earlier set minus
{♦φ, ♦ψ, ♦χ}. The new exhaustive interpretation will be
$\mathrm{Exh}_1^{K_1}(\Diamond(\varphi \lor \psi \lor \chi)) = \{w_4, w_5, w_6\}$, but all these worlds are ruled out for
$\mathrm{Prag}_1^{K_1}(\Diamond(\varphi \lor \psi \lor \chi))$
because of our disjunctive alternatives. This means that we have to go to
the next level. At level 2, the new set of worlds is just $\{w_7\}$, which is thus
also $\mathrm{Exh}_2^{K_2}(\Diamond(\varphi \lor \psi \lor \chi))$. World $w_7$ cannot be eliminated by a more precise
alternative, which means that also $\mathrm{Prag}_2^{K_2}(\Diamond(\varphi \lor \psi \lor \chi)) = \{w_7\}$, which is
what $\mathrm{Prag}^{K}(\Diamond(\varphi \lor \psi \lor \chi))$ will then denote as well. Notice that in $w_7$ it holds
that all of ♦φ, ♦ψ, and ♦χ are true: the desired free choice inference. Similar
reasoning applies to (φ ∨ ψ ∨ χ) > ξ.
These calculations have made clear that to account for free choice per-
mission, we have to make use of exhaustive interpretation several times. In
this sense it is similar to the analysis proposed by Fox (2007). Still, there
are some important differences. One major difference is that Fox (2007)
exhaustifies not only the sentence that is asserted, but also the relevant
alternatives. Moreover, Fox uses exhaustification to turn alternatives into
other alternatives, thereby “syntacticising” the process. We don’t do anything
like this, and therefore feel that what we do is more in line with the Gricean
approach. Exhaustification always means looking at “minimal” worlds: we
don’t change the alternatives. The worst that can happen to them is that
they are declared not to be relevant anymore to determine the pragmatic
interpretation.
Notice that our analysis also immediately explains why it is appropriate to
use any under ♦, but not under □: whereas ♦(φ∨ψ∨χ) pragmatically entails
♦(φ∨χ), □(φ∨ψ∨χ) does not pragmatically entail □(φ∨χ). It is easy to see
that our analysis can account for the “free choice” inference of the existential
sentence as well: that from Several of my cousins had cherries or strawberries
we naturally infer that some of the cousins had cherries and some had
that $\mathrm{Prag}_n^{K_n}(\psi) \neq \emptyset$
where ∃x♦P x and ∃x♦Qx are true but both ∀x♦P x and ∀x♦Qx false is
eliminated, because such a world could be more accurately expressed (given
the truth of ∀x♦(P x ∨ Qx)) by the alternative ∃x♦(P x ∨ Qx). While the
inclusion of ∃x♦(P x ∨Qx) among the alternatives of ∀x♦(P x ∨Qx) is not a
significant change to our framework, it has to be admitted that the exchange
of the notion $[\![\psi]\!]^{K_n}$ by the pragmatic interpretation of ψ is significant. From
an intuitive point of view, the effect of this exchange would be that we do
not only look at the exhaustive interpretation of φ, the sentence asserted,
but also at the exhaustive interpretations of the alternatives. As a result, our
analysis would become much closer to the proposal of Fox (2007). But, as
mentioned above, if we were to adopt the suggestion of Geurts & Pouscoulous
(2009b), this would, in fact, not be the way to go.
5 Conclusion
The papers of Geurts & Pouscoulous (2009a) and Chemla (2009) provide
strong empirical evidence that sentences in which a trigger of a scalar implicature
occurs under a universal do not in general give rise to an embedded
implicature. This evidence favors a globalist analysis of conversational implicatures
over its localist alternative. As far as I know, it is uncontroversial that
triggers occurring under an existential do give rise to implicatures. In this
paper, and following Franke (2010, 2009), I discussed some ways in which
these challenging examples for a “globalist” analysis of conversational implicatures
could be given a principled global pragmatic explanation after all. I
suggested how potentially problematic examples for our global pragmatic
analysis of the form ∀x♦(P x ∨ Qx), as discussed by Chemla (2009), could
be treated as well. At least two things have to be admitted, though. First,
our global analysis still demands that the alternatives are calculated locally.
I don’t think this is a major concession to localists. Second, according to
Zimmermann (2000), even a disjunctive permission of the form You may do
φ or you may do ψ gives rise to the free choice inference, and according
to Merin (1992) a conjunctive permission of the form You may do φ and ψ
allows the addressee to perform only φ. I have no idea how to pragmati-
cally account for those intuitions without reinterpreting the semantics of
conjunction as well as disjunction. If our analysis is acceptable, it points to
the direction in which richer pragmatic theories have to go: (i) we have to
take both the speaker’s and the hearer’s perspective into account, and (ii)
one-step inferences (or strong Bi-OT) are not enough, more reasoning has to
be taken into account (i.e., weak Bi-OT, or iteration). These are what I take to
be the main messages of this paper.
References
Schulz, Katrin. 2003. You may read it now or later: A case study on the
paradox of free choice permission. Amsterdam: University of Amster-
dam MA thesis. http://www.illc.uva.nl/Publications/ResearchReports/
MoL-2004-01.text.pdf.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice
permission. Synthese: Knowledge, Rationality and Action 147(2). 343–377.
doi:10.1007/s11229-005-1353-y.
Schulz, Katrin & Robert van Rooij. 2006. Pragmatic meaning and non-
monotonic reasoning: The case of exhaustive interpretation. Linguistics
and Philosophy 29(2). 205–250. doi:10.1007/s10988-005-3760-4.
Spector, Benjamin. 2003. Scalar implicatures: Exhaustivity and Gricean
reasoning. In Balder ten Cate (ed.), Eighth ESSLLI Student Session (European
Summer School in Logic, Language and Information), 277–288. Vienna.
http://www.cs.ucsc.edu/~btencate/esslli03/stus2003proc.pdf.
Spector, Benjamin. 2006. Aspects de la pragmatique des operateurs logiques.
Paris: University of Paris VII dissertation. http://cognition.ens.fr/
~bspector/THESE_SPECTOR/THESE_SPECTOR_AVEC_ANNEXE2.pdf.
von Wright, G. H. 1951. Deontic logic. Mind 60(237). 1–15.
doi:10.1093/mind/LX.237.1.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epis-
temic possibility. Natural Language Semantics 8(4). 255–290.
doi:10.1023/A:1011255819284.