
Semantics & Pragmatics Volume 3, Article 1: 1–72, 2010
doi: 10.3765/sp.3.1
Quantifiers in than-clauses∗
Sigrid Beck
University of Tübingen
Received 2009-01-13 / First Decision 2009-03-17 / Revised 2009-06-17 / Second
Decision 2009-07-06 / Revised 2009-07-27 / Accepted 2009-07-27 / Published 2010-01-25
Abstract
The paper reexamines the interpretations that quantifiers in than-clauses
give rise to. It develops an analysis that combines an interval semantics for
the than-clause with a standard semantics for the comparative operator. In
order to mediate between the two, interpretive mechanisms like maximality
and maximal informativity determine selection of a point from an interval.
The interval semantics allows local interpretation of the quantifier. Selection
predicts which interpretation this leads to. Cases in which the prediction
appears not to be met are explained via recourse to independently attested
external factors (e.g. the interpretive possibilities of indefinites). The goal
of the paper is to achieve coverage of the relevant data while maintaining a
simple semantics for the comparative. A secondary objective is to reexamine,
restructure and extend the set of data considered in connection with the
problem of quantifiers in than-clauses.
Keywords: comparatives, degrees, intervals, quantifiers, indefinites, plurals, scope
∗ Versions of this paper were presented at the workshop on covert variables in Tübingen 2006,
at two Semantic Network meetings (in Barcelona 2006 and Oslo 2007), at the 2009 Topics
in Semantics seminar at MIT, and at the Universität Frankfurt 2009. I would like to thank
the organizers Frank Richter and Uli Sauerland and the audiences at these presentations for
important feedback. Robert van Rooij and Jon Gajewski have exchanged ideas with me. The
B17 project of the SFB 441 has accompanied the work presented here — Remus Gergel, Stefan
Hofstetter, Sveta Krasikova, John Vanderelst — as have Arnim von Stechow and Irene Heim.
Several anonymous reviewers and Danny Fox have given feedback on earlier versions, and
David Beaver and Kai von Fintel have commented on the prefinal version. I am very grateful
to them all.
©2010 Sigrid Beck

This is an open-access article distributed under the terms of a Creative Commons
Non-Commercial License (creativecommons.org/licenses/by-nc/3.0).
1 Introduction
The problem of quantifiers in than-clauses has been puzzling linguists for a
long time, beginning with von Stechow 1984, via Schwarzschild & Wilkinson
2002, Schwarzschild 2004, and Heim 2006b, to very recent approaches in
Gajewski 2008, van Rooij 2008 and Schwarzschild 2008. It can be illustrated
with the examples below.
(1) John ran faster than every girl did.
(1′) a. For all x, x is a girl: John ran faster than x.
b. #The degree of speed that John reached exceeds the degree of speed
that every girl reached.
i.e. “John’s speed exceeds the speed of the slowest girl.”
(2) John ran faster than he had to.
(2′) a. #For all w, w is a permissible world: John ran faster in @ than he
ran in w.
b. The degree of speed that John reached in @ exceeds the degree of
speed that he has in every permissible world w.
i.e. “John’s actual speed exceeds the slowest permissible speed.”
(@ stands for the real world)
Example (1) intuitively only has a reading that appears to give the universal
NP scope over the comparison, namely (1′a): all the girls were slower than
John. The reading in which the universal NP takes narrow scope relative to
the comparison is paraphrased in (1′b). Here we must look at degrees of
speed reached by all girls; depending on the precise semantics of the
than-clause (see below), this could mean the maximal speed that they all reached,
i.e. the speed of the slowest girl. Example (1) has no reading that compares
John’s speed to the speed of the slowest girl. Sentence (2), on the other hand,
only has a reading that gives the modal universal quantifier narrow scope
relative to the comparison, (2′b). That is, we consider the degrees of speed
that John reaches in all worlds compatible with the rules imposed by the
modal base of have to. This will yield the slowest permissible speed, and (2)
intuitively says that John’s actual speed exceeded this minimum requirement.
The sentence is not understood to mean that John did something that was
against the rules; that is, reading (2′a), in which the modal takes scope over
the comparison, is not available.¹
¹ Heim (2006b) and Krasikova (2008) include a discussion of when readings like (2′a) are
available. The reading can be made more plausible with a suitable context, depending on the
modal chosen. For the moment I will stick to the simpler picture presented in the text. See
Section 3 for more discussion.
We must ask ourselves how a quantifier contained in the than-clause can
have wide scope at all, why it cannot get narrow scope in (1), and why (2) is the
opposite. Since, as we will see in more detail below, these questions look
unanswerable under the standard analysis of comparatives, the researchers
cited above have been led to a revision of the semantic analysis of comparison.
Schwarzschild & Wilkinson (2002) employ an interval semantics for the
than-clause and give the comparative itself an interval semantics. Heim (2006b)
adopts intervals, but ultimately reduces the semantics of the comparison
back to a degree semantics through semantic reconstruction. This allows
her to retain a simple meaning of the comparative operator. A than-clause
internal operator derives the different readings that quantifiers in
than-clauses give rise to. The line of research in Gajewski 2008, van Rooij 2008
and Schwarzschild 2008 in turn adopts the idea of a than-clause internal
operator but not the intervals.
In this paper, I pursue a strategy that can be seen as an attempt to simplify
Schwarzschild & Wilkinson’s proposal. Like them, I derive a meaning for
the than-clause without a than-clause internal operator, and that meaning is
based on an interval semantics. But I combine this with a standard semantics
of the comparative in the spirit of von Stechow 1984. This means that the
end result of interpreting the than-clause must be a degree. Everything will
hinge on selecting the right degree, so that each of the relevant examples
receives the right interpretation.
In Section 2, I present the current state of our knowledge in this domain.
The analysis of than-clauses is presented in Section 3. Section 4 ends the
paper with a summary and some discussion of consequences of the proposed
analysis.
2 State of affairs
I first present a sample of data that I take to be representative of the
interpretational possibilities that arise with quantifiers in than-clauses. Then I
sketch Schwarzschild & Wilkinson’s (2002) and Heim’s (2006b) analyses in
Section 2.2, and in Section 2.3 I give a summary of the proposals in Gajewski 2008,
van Rooij 2008 and Schwarzschild 2008.
2.1 The empirical picture
2.1.1 A classical analysis of the comparative
The basis of our present perception of the problem presented by (1) and
(2) is the analysis of the comparative construction, because the data are
understood in terms of whether the quantifier appears to take wide scope
over the comparison according to a classical analysis of the comparative,
or whether it would have to be seen as taking narrow scope relative to the
comparison. My presentation assumes a general theoretical framework like
Heim & Kratzer 1998 and begins with specifically Heim’s (2001) version of
the theory of comparison promoted in von Stechow 1984 (see also Klein 1991
and Beck 2009 for an exposition and Cresswell 1977; Hellan 1981; Hoeksema
1983; Seuren 1978 for theoretical predecessors). This theory is what I will
refer to as a classical analysis of the comparative. For illustration, I discuss
the simple example (3a) below. In (3b) I provide the Logical Form and in (3c)
the truth conditions derived by compositional interpretation of that Logical
Form, plus paraphrase. Interpretation relies on the lexical entries of the
comparative morpheme and gradable adjectives as given in (4).
(3) a. Paule is older than Knut is.
b. [-er [⟨d,t⟩ than 2 [Knut is t2 old]]
[⟨d,t⟩ 2 [Paule is t2 old]]]
c. max(λd. Paule is d-old) > max(λd. Knut is d-old) =
Age(Paule) > Age(Knut)
“The largest degree of age that Paule reaches exceeds the largest
degree of age that Knut reaches.”
“Paule’s age exceeds Knut’s age.”
(4) a. ⟦-er⟧ = λD_⟨d,t⟩ . λD′_⟨d,t⟩ . max(D′) > max(D)
b. ⟦old⟧_⟨d,⟨e,t⟩⟩ = [λd. λx. x is d-old]
= [λd. λx. Age(x) ≥ d]
c. Let S be a set ordered by R.
Then max_R(S) = ιs[s ∈ S & ∀s′ ∈ S[sRs′]]
Importantly, the role of the comparative operator is ultimately to relate
the maximal degree provided by the than-clause to some matrix clause
degree. The than-clause provides degrees through abstraction over the
degree argument slot of the adjective. Different versions of such a classical
analysis are available (for instance von Stechow’s (1984) own or Kennedy’s
(1997)), but the problem of quantifiers in than-clauses presents itself in a
parallel fashion in all of them.
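The lexical entries of the classical analysis for example (3) can be made concrete in a small Python sketch. This is an illustration only: the finite scale of whole years, the particular ages and the function names `old`, `max_degree` and `er` are assumptions introduced here and are not part of the analysis itself.

```python
# Illustrative model of the classical analysis; scale, ages and names are assumptions.
SCALE = range(0, 151)             # degrees of age in whole years (finite for the sketch)
AGE = {"Paule": 40, "Knut": 35}   # hypothetical ages


def old(d, x):
    """Gradable adjective: x is d-old iff Age(x) >= d."""
    return AGE[x] >= d


def max_degree(pred):
    """Maximality: the greatest degree on SCALE that satisfies pred."""
    degrees = [d for d in SCALE if pred(d)]
    if not degrees:
        raise ValueError("max is undefined for the empty set")
    return max(degrees)


def er(than_clause, matrix):
    """Comparative: the maximum of the matrix set exceeds the maximum of the than-clause set."""
    return max_degree(matrix) > max_degree(than_clause)


paule_is_older = er(lambda d: old(d, "Knut"), lambda d: old(d, "Paule"))
print(paule_is_older)  # True
```

Because the adjective meaning is monotone, the maximum of each degree set is simply the individual's age, which is why the truth conditions reduce to Age(Paule) > Age(Knut).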
I will make one small revision to the above version of the classical analysis:
I will suppose that what is written into the lexical entry of the comparative
morpheme as the maximality operator in (4a) is not actually part of the
meaning of the comparative itself. Rather, it is a general mechanism that
allows us to go from a description of a set to a particular object, for example
also in the case of free relative clauses in (5) (Jacobson 1995); see also Beck
2009. I represent maximality in the Logical Form, as indicated in (4′b). The
meaning of the comparative is then simply (4′a), the ‘larger than’ relation. It
is basically this meaning of the comparative that I will try to defend below.
The resulting interpretation remains of course the same.
(5) a. We bought [what we liked].
b. max(λx. we liked x)
(4′) a. ⟦-er⟧ = λd_d . λd′_d . d′ > d
b. [-er [d than max 2 [Knut is t2 old]]
[d max 2 [Paule is t2 old]]]
c. max(λd. Paule is d-old) > max(λd. Knut is d-old)
2.1.2 Apparent wide scope quantifiers
Universal NPs are a standard example for an apparent wide scope quantifier
(see e.g. Heim 2006b). The sentence in (6) below only permits the reading in
(6′a), not the one in (6′b). This can be seen from the fact that the sentence
would be judged false in the situation depicted below.
(6) John is taller than every girl is.
(6′) a. ∀x[girl(x) → max(λd. John is d-tall) > max(λd. x is d-tall)]
“For every girl x: John’s height exceeds x’s height.”
b. #max(λd. John is d-tall) > max(λd. ∀x[girl(x) → x is d-tall])
“John’s height exceeds the largest degree to which every girl is
tall.”
“John is taller than the shortest girl.”
[Figure: scale of height, increasing to the right, with g1’s height < J’s height < g2’s height < g3’s height.]
The classical semantics of comparatives makes this look as if the NP had
to take scope over the comparative. The LF given in (6″a) can straightforwardly
be interpreted to yield (6′a); analogously for (6″b) and (6′b). Thus,
strangely, the only LF the sentence appears to permit is (6″a), which violates
constraints on Quantifier Raising (QR): QR is normally confined to a simple
finite clause (May 1985 and much subsequent work). The LF in (6″b), which
would be unproblematic syntactically, is not possible.
(6″) a. [[every girl] [1 [[-er [d than max 2 [t1 is t2 tall]]
[d max 2 [John is t2 tall]]]]
b. [[-er [d than max 2 [every girl] [ 1 [t1 is t2 tall]]]
[d max 2 [John is t2 tall]]]
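The contrast between the two candidate readings of (6) can be checked mechanically. The following Python sketch is an illustration only: the finite centimetre scale, the particular heights (chosen to match the ordering in the situation depicted above) and all function names are assumptions, not part of the classical analysis.

```python
# Illustrative model; the scale, the heights and all names are assumptions.
SCALE = range(0, 301)  # degrees of height in whole centimetres
HEIGHT = {"John": 170, "g1": 160, "g2": 175, "g3": 180}
GIRLS = ["g1", "g2", "g3"]


def tall(d, x):
    """x is d-tall iff Height(x) >= d."""
    return HEIGHT[x] >= d


def max_degree(pred):
    """Greatest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))


john = max_degree(lambda d: tall(d, "John"))

# Apparent wide scope: for every girl x, John's height exceeds x's height.
wide_scope = all(john > max_degree(lambda d, x=x: tall(d, x)) for x in GIRLS)

# Apparent narrow scope: John's height exceeds the largest degree
# to which every girl is tall, i.e. the height of the shortest girl.
narrow_scope = john > max_degree(lambda d: all(tall(d, x) for x in GIRLS))

print(wide_scope, narrow_scope)  # False True
```

In this model only the apparent wide scope reading comes out false, matching the judgment that (6) is false in the depicted situation.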
The example with the differential in (7) shows the same behaviour (it uses a
version of the comparative that accommodates a difference degree, (7c)).
(7) a. John is 2″ taller than every girl is.
b. ∀x[girl(x) → max(λd. John is d-tall) ≥ max(λd. x is d-tall) + 2″]
= For every girl x: John’s height exceeds x’s height by 2″.
c. ⟦-er_diff⟧ = λd. λd′. λd″. d″ ≥ d + d′
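The differential comparative in (7c) can be sketched as a curried function. The heights in inches and the names `er_diff` and `john_is_taller_by` are assumptions made for illustration; the sketch also uses the fact that max(λd. x is d-tall) is simply x's height.

```python
# Hypothetical heights in inches; names and values are assumptions.
HEIGHT = {"John": 72, "g1": 66, "g2": 69, "g3": 70}
GIRLS = ["g1", "g2", "g3"]


def er_diff(d):
    """Curried differential comparative: takes d, then d', then d''; true iff d'' >= d + d'."""
    return lambda d1: lambda d2: d2 >= d + d1


def john_is_taller_by(diff):
    """For every girl x: Height(John) >= Height(x) + diff."""
    return all(er_diff(HEIGHT[x])(diff)(HEIGHT["John"]) for x in GIRLS)


print(john_is_taller_by(2))  # True
print(john_is_taller_by(3))  # False
```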
The problem posed by (6) and (7) is exacerbated in (8), as Schwarzschild &
Wilkinson (2002) observe. We have once more a universal quantifier, but
this time it is one that is taken to be immobile at LF: the intensional verb
predict. Still, the interpretation that is intuitively available looks to be one
in which the universal outscopes the comparison, (8′a). The interpretation
in which comparison takes scope over predict, (8′b), is not possible. This is
problematic because the LF we would expect (8) to have is (10), and (10) is
straightforwardly interpreted to yield (8′b).
(8) John is taller than I had predicted (that he would be).
(9) My prediction: John will be between 1.70 m and 1.80 m.
Claim made by (8): John is taller than 1.80 m.
(8′) a. ∀w[wR@ →
max(λd. John is d-tall in @) > max(λd. John is d-tall in w)]
“For every world compatible with my predictions: John’s actual
height exceeds John’s height in that world.”
b. # max(λd. John is d-tall in @)
> max(λd. ∀w[wR@ → John is d-tall in w])
“John’s actual height exceeds the degree of tallness which he has
in all worlds compatible with my predictions.”
“John’s actual height exceeds the shortest prediction, 1.70 m.”
(where R is the relevant accessibility relation, compare e.g. Kratzer
1991)
(10) [[-er [⟨d,t⟩ than max 2 [ I had predicted that [ John be t2 tall]]]
[⟨d,t⟩ max 2 [ John is t2 tall]]]
This is the interpretive behaviour of many quantified NPs, plural NPs like
the girls, quantificational adverbs, verbs of propositional attitude and some
modals (e.g. should, ought to, might). See Schwarzschild & Wilkinson 2002
and Heim 2006b for a more thorough empirical discussion.
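The scenario in (9) gives a convenient test case for the two readings of (8). The sketch below models the worlds compatible with the prediction simply by the height John has in each of them; this modelling choice, the centimetre scale and the function names are assumptions made for illustration.

```python
# Illustrative model of (8)/(9); worlds are represented only by John's height in them.
SCALE = range(0, 301)
PREDICTED = range(170, 181)  # John's height (cm) in each world compatible with the prediction


def reading_a(actual):
    """Universal over the comparison: John's actual height exceeds his height in every world."""
    return all(actual > h for h in PREDICTED)


def reading_b(actual):
    """Comparison over the universal: compare with the largest degree reached in all worlds."""
    shared = max(d for d in SCALE if all(h >= d for h in PREDICTED))
    return actual > shared


for actual in (165, 175, 183):
    print(actual, reading_a(actual), reading_b(actual))
# 165 False False
# 175 False True
# 183 True True
```

The case of 175 cm separates the readings: only the unavailable reading would count (8) as true when John's height falls inside the predicted range.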
2.1.3 Apparent narrow scope quantifiers
Not all quantificational elements show this behaviour. A universal quantifier
that does not is the modal have to, along with some others (be required, be
necessary, need). This is illustrated below.
(11) Mary is taller than she has to be.
(12) Mary wants to play basketball. The school rules require all players to
be at least 1.70 m. Claim made by (11): Mary is taller than 1.70 m.
(11′) a. ?#∀w[wR@ →
max(λd. Mary is d-tall in @) > max(λd. Mary is d-tall in w)]
= For every world compatible with the school rules:
Mary’s actual height exceeds Mary’s height in that world;
i.e. Mary is too tall.
b. max(λd. Mary is d-tall in @)
> max(λd. ∀w[wR@ → Mary is d-tall in w])
= Mary’s actual height exceeds the degree of tallness which she
has in all worlds compatible with the school rules;
i.e. Mary’s actual height exceeds the required minimum, 1.70 m.
These modals permit what appears to be a narrow scope interpretation
relative to the comparison. Example (11) does not favour an apparent wide
scope interpretation. Krasikova (2008) argues though that some examples
with have to–type modals may have both readings, depending on context. (13)
is one of her examples favouring a reading analogous to (11′a), an apparent
wide scope reading of have to (see Section 3 for more discussion).
(13) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)
= He was coming through too late.
Existential modals like be allowed also appear to take narrow scope:
(14) Mary is taller than she is allowed to be.
(15) a. #∃w[wR@ &
max(λd. Mary is d-tall in @) > max(λd. Mary is d-tall in w)]
= It would be allowed for Mary to be shorter than she actually is.
b. max(λd. Mary is d-tall in @)
> max(λd. ∃w[wR@ & Mary is d-tall in w])
= Mary’s actual height exceeds the largest degree of tallness that
she reaches in some permissible world; i.e. Mary’s actual height
exceeds the permitted maximum.
And so do some other existential quantifiers and disjunction:
(16) Mary is taller than anyone else is.
(17) a. #There is someone that Mary is taller than.
b. Mary’s height exceeds the largest degree of tallness reached by
one of the others.
(18) Mary is taller than John or Fred are.
(19) a. ?#For either John or Fred: Mary is taller than that person.
b. Mary’s height exceeds the maximum height reached by John or
Fred.
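The apparent narrow scope readings of (11), (14), (16) and (18) all apply maximality to a quantified degree predicate. In the Python sketch below, a universal under the maximum yields the minimum of the relevant heights and an existential or disjunction yields their maximum. The scale, the sets of permissible heights (including the 'at most 190 cm' rule) and all names are assumptions made for illustration.

```python
# Illustrative model; worlds and individuals are represented by heights in cm.
SCALE = range(0, 301)


def max_degree(pred):
    """Greatest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))


def narrow_universal(heights):
    """max(λd. every height reaches d): comes out as the minimum of the heights."""
    return max_degree(lambda d: all(h >= d for h in heights))


def narrow_existential(heights):
    """max(λd. some height reaches d): comes out as the maximum of the heights."""
    return max_degree(lambda d: any(h >= d for h in heights))


mary = 175
required = range(170, 221)    # Mary's height in worlds obeying 'at least 170 cm'
permitted = range(150, 191)   # Mary's height in worlds obeying a hypothetical 'at most 190 cm'
others = {"John": 172, "Fred": 168}

print(mary > narrow_universal(required))           # True: taller than she has to be
print(mary > narrow_existential(permitted))        # False: not taller than she is allowed to be
print(mary > narrow_existential(others.values()))  # True: taller than John or Fred is
```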
This is the interpretive behaviour of some modals (e.g. need, have to, be
allowed, be required), some indefinites (especially NPIs) and disjunction (com-
pare once more Heim 2006b). It is also the behaviour of negation and negative
quantifiers, with the added observation that the apparent narrow scope read-
ing is one which often gives rise to undefinedness, hence unacceptability (von
Stechow 1984; Rullmann 1995). (That this is not invariably the case is shown
by (22), illustrating that we are concerned with a constraint on meaning rather
than form.)
(20) *John is taller than no girl is.
(21) a. John’s height exceeds the maximum height reached by no girl.
The maximum height reached by no girl is undefined, hence:
unacceptability of this reading.
b. #There is no girl who John is taller than.
(22) I haven’t been to the hairdresser longer than I haven’t been to the
dentist.
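The undefinedness in (21a) can be made plausible with a small experiment. On any bounded scale, the set of degrees that no girl reaches does have a greatest member, but that member is always the arbitrary upper bound of the scale; this suggests that on a scale without an upper bound the set has no maximum. The heights and the function names below are assumptions made for illustration.

```python
# Illustrative check; heights and names are assumptions.
GIRL_HEIGHTS = [160, 175, 180]  # hypothetical heights in cm


def no_girl_is_d_tall(d):
    """True iff no girl's height reaches degree d."""
    return not any(h >= d for h in GIRL_HEIGHTS)


def max_on_bounded_scale(top):
    """Greatest d in 0..top such that no girl is d-tall (top must exceed the tallest girl)."""
    return max(d for d in range(0, top + 1) if no_girl_is_d_tall(d))


print([max_on_bounded_scale(top) for top in (200, 300, 1000)])  # [200, 300, 1000]
```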
Here is how the empirical picture presents itself from the point of view of
a classical analysis of comparatives. It appears that there are two different
scope readings possible for quantifiers embedded inside the than-clause,
wide or narrow scope relative to the comparison. But there is usually no
ambiguity. Each individual quantifier favours at most one reading (negation
frequently permits none). Apparent narrow scope readings are straightfor-
wardly captured by the classical analysis. It is unclear how apparent wide
scope readings are to be derived at all. As Schwarzschild & Wilkinson argue,
they are beyond the reach of an LF analysis. It is also unclear what creates
the pattern in the readings that we have observed.
Before we examine modern approaches to this problem, a final comment
on the data. I have presented them the way they are presented in the literature
on the subject, as if they were all impeccable and their interpretations clear.
But I would like to use this opportunity to point out that I find some of
them fairly difficult and perhaps not even entirely acceptable. This concerns
example (6), for which I would much prefer a version with a definite plural (the
girls instead of every girl). The NP the girls is, if anything, more problematic
under the classical analysis, as Schwarzschild & Wilkinson (2002) point out
(having less of an inclination towards wide scope); but see Section 4 for a
comment on how this issue may be relevant for the analysis developed in
this paper.
(6‴) a. ?John is taller than every girl is.
b. John is taller than the girls are.
∀x[x ∈ the girls → John is taller than x]
Further instances are examples with intensional verbs like predict or expect;
when a genuine range is predicted or expected, intuitions regarding when
sentences with differentials like (8″) would be true vs. false are not very firm.
This seems to me an area in which a proper empirical study might be helpful.
The issue is taken up in Section 3.4.
(8″) a. John is two inches taller than I had predicted (that he would be).
b. John arrived at most 10 minutes later than I had expected.
2.2 New analyses I
Since it is very hard to see how the data can be derived under the classical
theory, the two theories summarized below (Schwarzschild & Wilkinson 2002
and Heim 2006b) both change the semantics of the comparative construction
in ways that reanalyse scope. The quantificational element inside the than-
clause can take scope there even under the apparent wide scope reading.
The two theories differ with respect to the semantics they attribute to the
comparison itself. They also differ in their empirical coverage.
2.2.1 Schwarzschild & Wilkinson 2002
Schwarzschild & Wilkinson (2002) are led by the scope puzzle to a
complete revision of the semantics of comparison. The feature of the classical
analysis that they perceive as the crux of our problem is that the than-clause
provides a degree via abstraction over degrees. According to them, the
quantifier data show that the than-clause instead must provide us with an
interval on the degree scale — in (23) below an interval into which the height
of everyone other than Caroline falls.
(23) Caroline is taller than everyone else is.
‘Everyone else is shorter than Caroline.’
[Figure: scale of height, increasing to the right, with x1 < x2 < x3 < C. The interval that covers everyone else’s height (x1 to x3) is related to Caroline’s height by the comparative.]
(24) ⟦than everyone else is⟧ = λD. everyone else’s height falls within D
(where D is of type ⟨d,t⟩)²
To simplify, I will suppose that it is somehow ensured that we pick the right
matrix clause interval (Caroline’s height in (23), Joe’s height in the example
below).
² I present the discussion here in terms of the classical theory’s ontology, where degrees (type
d, elements of Dd) are points on the degree scale and what I call an interval is a set of points,
type ⟨d,t⟩.
(25) Joe is taller than exactly 5 people are.
Here is a rough sketch of Schwarzschild & Wilkinson’s analysis of this example.
(26) Subord: [λD. exactly 5 people’s height falls within D]
Matrix + Comp: max D′: [Joe’s height − D′] ≠ 0
the largest interval some distance below Joe’s height
Whole clause: the largest interval some distance below Joe’s height
is an interval into which exactly 5 people’s height falls.
Note that the quantifier is not given wide scope over the comparison at all
under this analysis. The interval idea allows us to interpret it within the
than-clause. While solving the puzzle of apparent wide scope operators, the
analysis makes wrong predictions for apparent narrow scope quantifiers (cf.
example (27)). The available reading cannot be accounted for ((28a) is the
semantics predicted by the classical analysis, corresponding to the intuitively
available reading; (28b) is the semantics that the Schwarzschild & Wilkinson
analysis predicts).
(27) John is taller than anyone else is.
(28) a. John’s height > max(λd. ∃x[x ≠ John & x is d-tall])
b. #The largest interval some distance below John’s height is an
interval into which someone else’s height falls = Someone is
shorter than John.
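The predictions sketched in (26) and (28) can be compared in a toy model. The sketch below treats "the largest interval some distance below x's height" as the set of all degrees strictly below that height on a finite centimetre scale. This is a deliberate simplification of Schwarzschild & Wilkinson's proposal, and the heights and function names are assumptions made for illustration.

```python
# Simplified stand-in for the interval-based analysis; all values and names are assumptions.
SCALE = range(0, 301)


def max_degree(pred):
    """Greatest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))


def interval_below(height):
    """All degrees strictly below height (stand-in for the largest interval below it)."""
    return {d for d in SCALE if d < height}


def count_within(heights, interval):
    """Number of heights that fall within the interval."""
    return sum(h in interval for h in heights)


# (25) Joe is taller than exactly 5 people are.
joe = 180
people = [150, 158, 165, 171, 179, 184, 190]
print(count_within(people, interval_below(joe)) == 5)  # True

# (27) John is taller than anyone else is.
john = 170
others = [160, 175]
classical = john > max_degree(lambda d: any(h >= d for h in others))
interval_based = count_within(others, interval_below(john)) >= 1
print(classical, interval_based)  # False True
```

With one person taller than John, the classical truth conditions in (28a) come out false, as intuitively required, whereas the interval-based truth conditions in (28b) come out true.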
The breakthrough achieved by this analysis is that we can assign to the than-
clause a useful semantics while interpreting the quantifier inside that clause.
For this reason, the interval idea is to my mind a very important innovation.
The analysis still has a crucial problem in that it does not extend to the
apparent narrow scope quantifiers. That is, it fails in precisely those cases
that were unproblematic for the classical analysis. I will also mention that
the semantics of comparison becomes rather complex under this analysis,
since the comparative itself compares intervals. This is not in line with the
plot I outlined above of maintaining as the semantics of the comparative
operator the plain ‘larger than’-relation.
2.2.2 Heim 2006b
Heim (2006b) adopts the interval analysis, but combines it with a scope
mechanism that derives ultimately a wide and a narrow scope reading of
a quantifier relative to a comparison. Her analysis extends proposals by
Larson (1988). Larson’s own analysis is only applicable to than-clauses with
an adjective phrase gap denoting a property of individuals — a limitation
remedied by Heim. Let us first consider her analysis of apparent wide scope
quantifiers, as in (29). Heim’s LF for the sentence is given in (30). She
employs an operator Pi (Point to Interval, credited to Schwarzschild (2004)),
whose semantics is specified in (31). Compositional interpretation (once more
somewhat simplified for the matrix clause, for convenience) is given in (32).
(29) John is taller than every girl is.
(30) [IP [CP than [1 [every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]]]
[IP 4 [[-er t4] [5 [John is t5 tall]]]]
(31) ⟦Pi⟧ = λD. λP. max(P) ∈ D
(32) a. main clause:
⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
b. than-clause:
⟦[than [1 [every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]]]⟧ =
λD′. ⟦[every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]⟧^g[D′/1] =
λD′. ∀x[girl(x) → ⟦[AP [Pi t1] [3 [AP t2 is t3 tall]]]⟧^g[D′/1, x/2]] =
λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(⟦[3 [t2 is t3 tall]]⟧^g[x/2])] =
λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(λd. Height(x) ≥ d)] =
λD′. ∀x[girl(x) → max(λd. Height(x) ≥ d) ∈ D′] =
λD′. ∀x[girl(x) → Height(x) ∈ D′]
intervals into which the height of every girl falls
c. main clause + than-clause:
⟦(29)⟧ =
[λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is taller than d) =
∀x[girl(x) → Height(x) ∈ (λd. John is taller than d)] =
for every girl x: John is taller than x
The than-clause provides intervals into which the height of every girl falls.
The whole sentence says that the set of degrees exceeded by John’s height is
such an interval. Semantic reconstruction (i.e. lambda conversion) simplifies the
whole to the claim intuitively made, that every girl is shorter than John. The
analysis assumes that the denotation domain Dd is a set of degree ‘points’,
and that intervals are of type ⟨d,t⟩, i.e. elements of D⟨d,t⟩.
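The composition for (29) can be replayed with higher-order functions. In the Python sketch below, intervals are modelled as sets of degrees and degree predicates as functions; the finite scale, the heights and the names `pi`, `tall_pred` and `than_every_girl` are assumptions made for illustration and are not Heim's own formalisation.

```python
# Illustrative model of the Pi-based derivation for (29); values and names are assumptions.
SCALE = range(0, 301)
HEIGHT = {"John": 185, "g1": 160, "g2": 175, "g3": 180}
GIRLS = ["g1", "g2", "g3"]


def pi(D):
    """Pi: takes an interval D (a set of degrees), then a degree predicate P;
    true iff max(P) is an element of D."""
    return lambda P: max(d for d in SCALE if P(d)) in D


def tall_pred(x):
    """The degree predicate λd. Height(x) >= d."""
    return lambda d: HEIGHT[x] >= d


def than_every_girl(D):
    """Than-clause: true of the intervals D into which every girl's height falls."""
    return all(pi(D)(tall_pred(x)) for x in GIRLS)


# Main clause: the set of degrees d such that John is taller than d.
main_clause = {d for d in SCALE if HEIGHT["John"] > d}

sentence = than_every_girl(main_clause)
paraphrase = all(HEIGHT["John"] > HEIGHT[x] for x in GIRLS)
print(sentence, paraphrase)  # True True
```

Applying the than-clause to the main clause set gives the same truth value as the paraphrase "for every girl x: John is taller than x", which is the effect attributed to semantic reconstruction.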
The analysis is a way of interpreting the quantifier inside the than-clause,
and deriving the apparent wide scope reading over the comparison through
giving the quantifier scope over the shift from degrees to intervals (the Pi
operator). It is applicable to other kinds of quantificational elements like
intensional verbs in the same way. Our example with predict is analysed
below; the intuitively plausible reading can now be derived straightforwardly
from the LF in (34).
(33) a. John is taller than I had predicted (that he would be).
b. ∀w[wR@ → max(λd. John is d-tall in @) >
max(λd. John is d-tall in w)]
= For every world compatible with my predictions:
John’s actual height exceeds John’s height in that world.
(34) [IP [CP than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]]
[IP 3 [John is taller than t3]]]
(35) a. main clause:
⟦[3 [John is taller than t3]]⟧ = (λd. John is taller than d in @)
b. than-clause:
⟦[than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]⟧ =
[λD′. ∀w[wR@ → ⟦[CP [Pi t1] [2 [AP John t2 tall]]]⟧^g[D′/1]]] =
[λD′. ∀w[wR@ → max(λd. Height(John)(w) ≥ d) ∈ D′]] =
[λD′. ∀w[wR@ → Height(John)(w) ∈ D′]]
intervals into which John’s height falls in all my predictions
c. main clause + than-clause:
⟦(34)⟧ =
[λD′. ∀w[wR@ → Height(John)(w) ∈ D′]]
(λd. J is taller than d in @)
= for every w compatible with my predictions:
John’s actual height exceeds John’s height in w.
The effect of the Pi operator on the predicate of degrees it combines with
is sketched below for the AP tall. As long as a than-clause quantifier takes
scope over the Pi operator, the resulting meaning of the whole sentence will
be one that lets the quantifier take scope over the comparison, even though
it is interpreted syntactically below the comparative operator and inside the
than-clause.
(36) Pi shifts from degrees to intervals:
[λd. Height(x) ≥ d] ⟹ [λD. Height(x) ∈ D]
In contrast to Schwarzschild & Wilkinson’s original interval analysis, Heim
is able to derive apparently narrow scope readings of an operator relative to
the comparison as well. The sentence in (37a) is associated with the LF in (38).
Note that here, the shifter takes scope over the operator have to. This makes
have to combine with the degree semantics in the original, desired way,
giving us the minimum compliance height (just like it did before, without the
intervals). The shift is essentially harmless.
(37) a. Mary is taller than she has to be.
b. max(λd. Mary is d-tall in @)
> max(λd. ∀w[wR@ → Mary is d-tall in w])
= Mary’s actual height exceeds the degree of tallness which she has
in all worlds compatible with the school rules.
(38) [IP [CP than [1 [[[Pi t1] [2 [has-to [Mary t2 tall]]]]]]]
[IP 3 [Mary is taller than t3]]]
(39) a. main clause:
⟦[3 [Mary is taller than t3]]⟧ = (λd. Mary is taller than d in @)
b. than-clause:
⟦[than [1 [[Pi t1] [2 [has-to [Mary t2 tall]]]]]]⟧ =
λD′. ⟦[[Pi t1] [2 [has-to [Mary t2 tall]]]]⟧^g[D′/1] =
λD′. max(λd. ⟦[has-to [Mary t2 tall]]⟧^g[d/2]) ∈ D′ =
λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′
intervals into which the required minimum falls
c. main clause + than-clause:
⟦(38)⟧ =
[λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′]
(λd. Mary is taller than d in @)
= Mary is taller than the required minimum.
Other apparent narrow scope operators receive a parallel analysis. The
crucial ingredient to this analysis is that the Pi operator is a scope bearing
element, able to take local or non-local scope. Pi-phrase scope interaction is
summarized below:
(40) Pi takes narrow scope relative to quantifier
⟹ apparent wide scope reading of quantifier over comparison
Pi takes wide scope relative to quantifier
⟹ apparent narrow scope reading of quantifier relative to comparison
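The scope interaction summarised in (40) can be illustrated for the have to example. In the Python sketch below, the permissible worlds are modelled only by Mary's height in each of them; this choice, the heights and the names `pi_below_modal` and `pi_above_modal` are assumptions made for illustration.

```python
# Illustrative model of Pi/modal scope interaction; values and names are assumptions.
SCALE = range(0, 301)
REQUIRED = range(170, 221)  # Mary's height (cm) in each world compatible with the rules
MARY = 175                  # Mary's actual height


def pi(D):
    """Pi: interval D, then degree predicate P; true iff max(P) is in D."""
    return lambda P: max(d for d in SCALE if P(d)) in D


# Main clause: the set of degrees d such that Mary is actually taller than d.
main_clause = {d for d in SCALE if MARY > d}


def pi_below_modal(D):
    """has-to scopes over Pi: Mary's height in every permissible world falls within D."""
    return all(pi(D)(lambda d, h=h: h >= d) for h in REQUIRED)


def pi_above_modal(D):
    """Pi scopes over has-to: the required minimum falls within D."""
    return pi(D)(lambda d: all(h >= d for h in REQUIRED))


print(pi_below_modal(main_clause))  # False: apparent wide scope of the modal
print(pi_above_modal(main_clause))  # True: apparent narrow scope of the modal
```

With Mary at 175 cm and a required minimum of 170 cm, only the configuration in which Pi outscopes the modal yields the intuitively available reading.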
Thus than-clauses include a shift from degrees to intervals, which allows
us to assign a denotation to the than-clause with the quantifier. The shift
amounts to a form of type raising. Through semantic reconstruction, the
matrix clause is interpreted in the scope of a than-clause operator when that
operator has scope over the shifter. In contrast to Schwarzschild & Wilkinson,
comparison is ultimately between degrees, not intervals.
Heim’s analysis is able to derive both wide and narrow scope readings of
operators in than-clauses. It does so without violating syntactic constraints.
There is, however, an unresolved question: when do we get which reading?
How could one constrain Pi-phrase/operator interaction in the desired way?
One place where this problem surfaces is once more negation, where we
expect an LF that would generate an acceptable wide scope of negation
reading. That is, the LF in (41b) should be grammatical and hence (41a)
should be acceptable on the reading derived from this LF in (42).
(41) a. *John is taller than no girl is.
b. [IP [CP than [1 [no girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]
[IP 4 [[-er t4] [5 [John is t5 tall]]]]
(42) a. main clause:
⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
b. than-clause:
⟦[than [1 [no girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ =
λD′. for no girl x: max(λd. x is d-tall) ∈ D′
intervals into which the height of no girl falls
c. main clause + than-clause:
⟦(41b)⟧ =
[λD′. for no girl x: max(λd. x is d-tall) ∈ D′](λd. J is taller than d)
= for no girl x: John is taller than x
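The overgeneration problem with negation can be reproduced in the same toy setting. In the sketch below, which again rests on assumed heights and hypothetical function names, the LF in which no girl outscopes Pi yields a perfectly well-defined truth value, contrary to the unacceptability of (41a).

```python
# Illustrative model of the derivation for (41); values and names are assumptions.
SCALE = range(0, 301)
HEIGHT = {"John": 150, "g1": 160, "g2": 175}
GIRLS = ["g1", "g2"]


def pi(D):
    """Pi: interval D, then degree predicate P; true iff max(P) is in D."""
    return lambda P: max(d for d in SCALE if P(d)) in D


def than_no_girl(D):
    """Than-clause: true of the intervals D into which the height of no girl falls."""
    return not any(pi(D)(lambda d, x=x: HEIGHT[x] >= d) for x in GIRLS)


# Main clause: the set of degrees d such that John is taller than d.
main_clause = {d for d in SCALE if HEIGHT["John"] > d}

print(than_no_girl(main_clause))  # True: 'for no girl x: John is taller than x'
```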
Adopting the interval analysis, but combining it with a scope mechanism
and semantic reconstruction, allows Heim to derive both types of readings
(apparent narrow and apparent wide scope), and to reduce the comparison
ultimately back to a comparison between degrees. Thus her empirical cover-
age is greater and the semantics of comparison simpler than Schwarzschild &
Wilkinson’s analysis. The problem that this analysis faces is overgeneration.
We do not have an obvious way of predicting when we get which reading.
The fact that in general, only one scope possibility is available makes one
doubt that this is really a case of systematic scope ambiguity.
2.3 Alternative new analyses: Gajewski, van Rooij, Schwarzschild
There is a group of new proposals — Gajewski 2008, van Rooij 2008 and
Schwarzschild 2008 — for how to deal with quantifiers in than-clauses whose
approach seems to be inspired by Heim’s (2006b) analysis. I present below a
simplified version of this family of approaches that is not entirely faithful
to any of them. I call this the NOT-theory. It can be summarized in relation
to the previous subsection as ‘keep the than-clause internal operator, but
not the intervals’. It adopts the idea that there is an operator — like Heim’s
Pi — that can take wide or narrow scope relative to a than-clause quantifier,
dictating what kind of reading the comparative sentence receives. It does not
adopt an interval analysis, and thus the operator is not Pi and the semantics
of the comparative is not the classical one. Instead, the operator is negation
and the proposed semantics is basically Seuren’s (1978).
2.3.1 Seuren’s semantics for the comparative (operator: NOT)
Seuren (1978) suggests (43b) as the interpretation of (43a). The than-clause

provides the set of degrees of tallness that Bill does not reach. It does so
by virtue of containing a negation, as illustrated in the LF in (44). This
meaning could be combined intersectively with the main clause and the
degree existentially bound, as represented in (45).
(43) a. John is taller than Bill is.

b. ∃d[Height(J) ≥ d & ¬ Height(B) ≥ d]
c. There is a degree of tallness that John reaches and Bill doesn’t
reach.
(44) a. than λd[NOT Bill is d-tall]

b. λd[¬ Height(B) ≥ d] = λd[Height(B) < d]
(45) [∃ [λd [John is d-tall] [than λd [NOT Bill is d-tall]]]
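The truth conditions in (43b) can be checked mechanically on a finite scale. The following is a minimal sketch for illustration only; the discretized scale (whole centimetres) and the function names are assumptions and are not part of Seuren's formulation.

```python
def than_clause_not(height_bill):
    """(44b): the set of degrees d that Bill does NOT reach, i.e. Height(B) < d."""
    return lambda d: not (height_bill >= d)


def seuren_comparative(height_john, than_clause, scale):
    """(43b)/(45): some degree on the scale is reached by John and lies in the than-clause set."""
    return any(height_john >= d and than_clause(d) for d in scale)


# Hypothetical scale: degrees 0..250 (cm).
SCALE = range(0, 251)

john_taller = seuren_comparative(180, than_clause_not(170), SCALE)
john_same_height = seuren_comparative(170, than_clause_not(170), SCALE)
```

Under these assumptions the comparative comes out true only when John's height strictly exceeds Bill's, which matches the paraphrase in (43c).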
The authors mentioned above note that this semantics gives us an easy
way to derive the intuitively correct interpretation for apparent wide scope
quantifiers. This is illustrated below for the universal NP. In (46) I show that
the desired meaning is easily described in this analysis and in (47) I provide
the LF for the than-clause that derives it. (48) illustrates that some, another
apparent wide scope quantifier, is equally unproblematic.
(46) a. John is taller than every girl is.

b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
c. every girl is shorter than John.
(47) a. than every girl is
b. than λd [every girl [1 [NOT [t1 is d tall]]]]
c. than λd. ∀x[girl(x) → Height(x) < d]
[Diagram: height scale with the girls' heights g1, g2, g3, g4 marked.]
(48) a. John is taller than some girl is.
b. ∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
c. there is a girl who is shorter than John.
An interesting application is negation, here illustrated with the negative

quantifier no. Proceeding in the now familiar way, we derive (49b). Rephrasing
this in terms of (49c) makes it clear that the resulting semantics is very weak.
Whenever the girls have any measurable height at all — that is, whenever the
than-clause can be appropriately used — there will be a height degree that
John reaches and that all the girls reach as well. The smallest degree on the
scale will be such a degree. The NOT-theory proposes that the sentence is
unacceptable because it is necessarily uninformative.

(49) a. *John is taller than no girl is.
b. ∃d[Height(J) ≥ d & for no girl x : Height(x) < d]
c. ∃d[Height(J) ≥ d & for every girl x : Height(x) ≥ d]
uninformative!
(The lowest degree on the height scale makes this true.)
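The uninformativity argument can be made concrete with a small sketch. The scale, the sample heights and the function name are assumptions made for illustration; the point is only that the lowest degree always serves as a witness.

```python
SCALE = range(0, 251)  # hypothetical height degrees in cm; 0 is the lowest degree


def no_girl_reading(height_john, girl_heights, scale=SCALE):
    """(49b): some d with Height(J) >= d such that no girl x has Height(x) < d."""
    return any(
        height_john >= d and not any(h < d for h in girl_heights)
        for d in scale
    )


# John shorter than, in between, and taller than the girls.
results = [no_girl_reading(j, [150, 165, 172]) for j in (140, 160, 190)]
```

In this sketch the formula is true in all three scenarios, so it cannot distinguish between them.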
2.3.2 NOT has to take varying scope
The NOT-theory needs another important ingredient: Just like the Pi-operator
above, other than-clause internal operators have to take flexible scope relative
to NOT in order to create the different readings we observe. This is illustrated
below with the familiar have to example, and with allowed.
(50) a. Mary is taller than she has to be.

b. #∃d[Height(M)(@) ≥ d & ∀w[wR@ → NOT Height(M)(w) ≥ d]]
Mary should have been shorter than she is.
c. ∃d[Height(M)(@) ≥ d & NOT∀w[wR@ → Height(M)(w) ≥ d]]
Mary is taller than the minimally required height.
(51) a. John is taller than he is allowed to be.
b. #∃d[Height(J)(@) ≥ d & ∃w[wR@ & NOT Height(J)(w) ≥ d]]
∃d[Height(J)(@) ≥ d & ∃w[wR@ & Height(J)(w) < d]]
John would have been allowed to be shorter than he is.
c. ∃d[Height(J)(@) ≥ d & NOT∃w[wR@ & Height(J)(w) ≥ d]]
John is taller than the tallest permissible height.
(52) a. #than λd [allowed [λw [NOT [John is d tall in w]]]]
b. than λd [NOT [allowed [λw [John is d tall in w]]]]
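The two scope configurations in (50) and (51) can be compared in a small model. This is a sketch under stated assumptions: accessible worlds are represented simply by the height the subject has in them, the scale is discretized, and the function names are hypothetical. Python's built-in `all` and `any` stand in for the universal and existential modal.

```python
SCALE = range(0, 251)  # hypothetical degrees (cm)


def not_over_modal(actual, accessible_heights, quantifier, scale=SCALE):
    """NOT > modal, as in (50c) and (51c): some d that the actual height reaches,
    such that it is not the case that all / some accessible worlds reach d."""
    return any(
        actual >= d and not quantifier(h >= d for h in accessible_heights)
        for d in scale
    )


def modal_over_not(actual, accessible_heights, quantifier, scale=SCALE):
    """modal > NOT, as in (50b) and (51b): some d that the actual height reaches,
    such that in all / some accessible worlds the height does not reach d."""
    return any(
        actual >= d and quantifier(not (h >= d) for h in accessible_heights)
        for d in scale
    )


# Hypothetical heights of the subject across the accessible worlds.
permitted = [160, 170, 180]
```

With `permitted` as above, `not_over_modal(165, permitted, all)` corresponds to the available reading of (50a) and `not_over_modal(185, permitted, any)` to the available reading of (51a); `modal_over_not(175, permitted, any)` shows that the unavailable reading (51b) would already be true of a height inside the permitted range.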
Just like the Pi-theory, then, the NOT-theory is able to generate the range
of readings we observe for operators in than-clauses. It seems somewhat
simpler than the Pi-theory in that it does not take recourse to intervals in
addition to a scopally flexible than-clause internal operator. But as in the case
of the Pi-theory, we must next ask ourselves what prevents the unavailable
readings, e.g. what excludes the LF in (52a).
2.3.3 Which reading?
The NOT-theory would have an empirical advantage over the Pi-theory if con-
straints on scope could be found to deal with the overgeneration problem we
noted above. A first successful application concerns polarity items. Example (53a)
can only have the LF in (54b), not the one in (54a), according to constraints on
the distribution of NPIs. Thus we only derive the appropriate interpretation.
Note though that the Pi-theory has the same success since the scope of Pi
is a downward entailing environment, but the rest of the than-clause isn’t
(compare Heim 2006b). (55) is the mirror image.
(53) a. John is taller than any girl is.

b. #∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
there is a girl who is shorter than John.
c. ∃d[Height(J) ≥ d & NOT ∃x[girl(x) & Height(x) ≥ d]]
John reaches a height degree that no girl reaches.
= John is taller than every girl.
(54) a. *than λd [any girl [1 [NOT [t1 is d tall]]]]
b. than λd [NOT [any girl [1 [t1 is d tall]]]]
(55) John is taller than some girl is.
Let us next reexamine negation. Two interpretations need to be considered.

The one in (56b) was already rejected above as uninformative. It turns out that
the alternative interpretation is equally uninformative. The ungrammaticality
of negation in than-clauses is thus captured elegantly by this theory. Here it
has an advantage over the Pi-theory.

(56) a. *John is taller than no girl is.
b. ∃d[Height(J) ≥ d & for no girl x : Height(x) < d] uninformative
c. ∃d[Height(J) ≥ d & NOT for no girl x : Height(x) ≥ d] uninformative
= ∃d[Height(J) ≥ d & some girl x : Height(x) ≥ d]
Among the proponents of the NOT-theory, Schwarzschild (2008) examines

modals. He argues that the NOT-theory predicts that modals in than-clauses
should give rise to the same reading that they have with ordinary clause-mate
negation. This prediction is borne out, as the examples below illustrate.
(57) a. John is not allowed to be that tall. NOT allowed

b. than he is allowed to be.
(58) a. John might not be that tall. might NOT
b. than he might be.
(59) a. John is not supposed to be that tall. supposed NOT
b. than he is supposed to be.
(60) a. John is not required to be that tall. NOT required
b. than he is required to be.
While this is helpful with modals, it stops short of explaining the interpreta-
tion associated with intensional full verbs like predict.
(61) a. John was not predicted to be that tall. NOT predict — #

b. than he was predicted to be.
Two further possible constraints are discussed. Van Rooij (2008) examines
universal DPs and Gajewski (2008) investigates numeral DPs. Let us consider
both in turn.
Note first that a universal DP is ambiguous relative to clause mate nega-
tion. In particular it allows a reading in which the universal takes narrow
scope relative to negation. Thus there are no inherent scope constraints that
would help us to exclude (63′b) as an LF of (63a). But exclude it we must,
since it gives rise to the unavailable reading (63c).
(62) a. Every girl isn’t that tall. ambiguous

b. than every girl is.
(63) a. John is taller than every girl is.
b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
‘Every girl is shorter than John.’
c. #∃d[Height(J) ≥ d & NOT ∀x[girl(x) → Height(x) ≥ d]]
‘John reaches a height that some girl doesn’t.’
= John is taller than the shortest girl.
(63′) a. than λd [every girl [1 [NOT [t1 is d tall]]]]
b. *than λd [NOT [every girl [1 [t1 is d tall]]]]
Van Rooij observes that (63′a) yields stronger truth conditions than (63′b). He
proposes that if no independent constraint excludes one of the LFs, you have
to pick the one that results in the stronger truth conditions. This amounts to
the suggestion that than-clauses fall within the realm of application of the
Strongest Meaning Hypothesis (SMH; Dalrymple, Kanazawa, Kim, Mchombo &
Peters 1998). If they do, the NOT-theory can make the desired predictions
about every DPs (and some other relevant examples). So could the Pi-theory,
though, so this does not distinguish between the two scope based theories of
quantifiers in than-clauses.
While I am sympathetic to the idea of extending application of the SMH, I
see some open questions for doing so in the case of than-clauses. Dalrymple
et al. originally proposed the SMH to deal with the interpretation of recipro-
cals. (64a) receives a stronger interpretation than (64b), for example, because
the predicate to stare at makes it factually impossible for the reading of (64a)
to ever be true. Similarly for (64c) vs. (64a,b). But (64a) only has one interpretation,
the strongest one, and (64b) also cannot have a reading parallel to
(64c). The SMH says, very roughly, that out of the set of theoretically possible
interpretations you choose the strongest one that has a chance of resulting
in a true statement, i.e. that is conceptually possible.³
(64) a. These three people know each other.

= everyone knows everyone else.
b. These three people were staring at each other.
= everyone was staring at someone else.
c. These three people followed each other into the elevator.
= everyone followed, or was followed by, someone else.
There is a theoretical question as to when the SMH applies. We would not

wish it to apply in (62) for instance because it would predict that there is
no ambiguity. When there is ambiguity, the data in question must not be
subject to the SMH. Are than-clauses in the domain of application of the SMH?
Prima facie, this seems very plausible, because — just like reciprocals — they
are (almost always) unambiguous, while semantic theory provides several
potential interpretations. What strikes me as problematic is that there is
no way to make the weaker reading emerge, even if the stronger reading is
conceptually impossible. The following sentences are necessarily false, rather
than having the interpretations indicated.
(64′) a. (about a 100 m race:)
The next to last finalist was faster than every other finalist.
≠ the next to last finalist was faster than the slowest other finalist.
b. (in an elevator:)
The second button from the bottom is higher than every other button.
≠ the second button from the bottom is higher than the lowest other button.
c. Friday is earlier than every other day of the week.
≠ Friday is earlier than the latest other day of the week.
3 Below I provide the formulation of the SMH given in Beck 2001. If we extend the domain of
application of the SMH to than-clauses, we need to strike out those phrases that make explicit
reference to reciprocals, as indicated. The relevant point is that the SMH makes reference
to interpretations compatible with non-linguistic information I, which in the examples in
(64′) would be knowledge about the order of finalists, elevator buttons and weekdays,
parallel to knowledge about processions of people and possibilities for staring in (64).
(i) Strongest Meaning Hypothesis (SMH)
Let Sr be the set of theoretically possible reciprocal interpretations for a sentence
S. Then, S can be uttered felicitously in a context c, which supplies non-linguistic
information I relevant to the reciprocal’s interpretation, provided that the set Sc has
a member that entails every other one.
Sc = {p: p is consistent with I and p ∈ Sr}
In that case, the use of S in c expresses the logically strongest proposition in Sc.
Thus than-clauses do not seem parallel to reciprocals. It would be better if an

LF that gives rise to the ‘the least . . . other’ reading for universal DPs simply
did not exist.
Turning now to numeral DPs, note first that it is not immediately obvious
how the NOT-theory predicts a plausible meaning for them at all. Gajewski
(2008) points out that the following analysis of exactly-DPs gives rise to truth
conditions that are too weak. (65′) would be true in a situation in which more
than three girls stay below John’s height.
(65) John is taller than exactly three girls are.

(65′) ∃d[Height(J) ≥ d & for exactly 3 girls x : Height(x) < d]
At least three girls are below John’s height
[Diagram: height scale with the girls' heights g1 to g5, the degree d and John's height H(J) marked.]
Reversing the scope of NOT and the exactly-DP doesn’t help:
(65″) ∃d[Height(J) ≥ d & NOT for exactly 3 girls x : Height(x) ≥ d]

there is a degree of height that John reaches that is not reached by
exactly 3 girls,
i.e. fewer or more girls reach that degree
true e.g. if John is taller than every one of five girls
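That both scope orders are too weak for (65) can be verified in a small model. The sketch below is illustrative only; the discretized scale, the sample heights and the function names are assumptions.

```python
SCALE = range(0, 251)  # hypothetical height degrees in cm


def exactly_dp_over_not(height_john, girl_heights, n=3, scale=SCALE):
    """DP > NOT: some d with Height(J) >= d such that exactly n girls x have Height(x) < d."""
    return any(
        height_john >= d and sum(h < d for h in girl_heights) == n
        for d in scale
    )


def not_over_exactly_dp(height_john, girl_heights, n=3, scale=SCALE):
    """NOT > DP: some d with Height(J) >= d such that it is not the case
    that exactly n girls reach d."""
    return any(
        height_john >= d and sum(h >= d for h in girl_heights) != n
        for d in scale
    )


# Five girls, all shorter than a John of 180 cm.
five_girls = [150, 155, 160, 165, 170]
dp_wide = exactly_dp_over_not(180, five_girls)
not_wide = not_over_exactly_dp(180, five_girls)
```

Both formulas come out true here, although five girls rather than exactly three are shorter than John.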
Gajewski develops an analysis that relies on Krifka’s (1999) work on exactly,

at least and at most, according to which these elements take effect at the level
of the utterance, far away from their surface position. I present this analysis
in simplified terms below, using (66) to illustrate. The semantic effect of
exactly is due to an operator I call EXACT, which applies at the utterance
level and operates on the basis of the ordinary as well as the focus semantic
value of its argument. The operator’s semantics is given in (67). The truth
conditions derived for the example are the right ones, as shown in (68) ((68)
uses Link’s (1983) operator ∗ for pluralization of the noun).
(66) a. Exactly three girls weigh 50 lb.

b. [EXACT [XP (exactly) threeF girls weigh 50 lb.]]
(66′) ⟦threeF girls weigh 50 lb.⟧^o =
∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)]
⟦threeF girls weigh 50 lb.⟧^f =
{∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)] : n ∈ N}
(67) ⟦EXACT⟧(⟦XP⟧^f)(⟦XP⟧^o) = 1
iff ⟦XP⟧^o = 1 & ∀q ∈ ⟦XP⟧^f : ¬(⟦XP⟧^o → q) → ¬q
‘Out of all the alternatives of XP, the most informative true one is the
ordinary semantics of XP.’
(68) (66b) = 1 iff
∃X[∗ girl(X)&card(X) = 3 & ∗weigh.50.lb(X)] &
∀n[n > 3 → ¬∃X[∗ girl(X) & card(X) = n & ∗weigh.50.lb(X)]] iff
max(λn.∃X[∗ girl(X) & card(X) = n & ∗weigh.50.lb(X)]) = 3
Krifka’s analysis of exactly allows us to assign the problematic example (65)

the LF in (69), which captures the right meaning, namely the interpretation in
(69′).
(69) EXACT [∃ [λd [John is d-tall] [than λd [threeF girls [λx [NOT x is
d-tall]]]]
(69′) max(λn. ∃d[Height(J) ≥ d & for n girls x : Height(x) < d])
the largest number n such that John reaches a height that n girls
don’t is 3. = exactly three girls are shorter than John.
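The maximum over numerals used in this interpretation can be computed directly in a finite model. This is a minimal sketch, assuming a discretized scale and reading "for n girls" as "for at least n girls" (the existential plural reading of the numeral); the function names are hypothetical.

```python
SCALE = range(0, 251)  # hypothetical height degrees in cm


def n_girls_below_some_d(height_john, girl_heights, n, scale=SCALE):
    """Some d with Height(J) >= d such that at least n girls x have Height(x) < d."""
    return any(
        height_john >= d and sum(h < d for h in girl_heights) >= n
        for d in scale
    )


def largest_n(height_john, girl_heights, scale=SCALE):
    """The largest n (between 0 and the number of girls) for which the statement holds."""
    return max(
        n
        for n in range(len(girl_heights) + 1)
        if n_girls_below_some_d(height_john, girl_heights, n, scale)
    )


girls = [150, 155, 160, 170, 175]
n_for_168 = largest_n(168, girls)  # three girls are shorter than 168 cm
n_for_180 = largest_n(180, girls)  # all five girls are shorter than 180 cm
```

In this model the maximum equals 3 exactly when three girls are shorter than John, which is the interpretation the LF with the numeral DP scoping over NOT is meant to deliver.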
Thus independently motivated assumptions about numerals allow the NOT-

theory to derive the desired interpretation. However, there is still the question
of the other LF, (70), in which NOT takes scope over the DP. This gives rise to
interpretation (70′).
(70) EXACT [∃ [λd [John is d-tall] [than λd [NOT threeF girls [λx [x is
d-tall]]]]
(70′) max(λn. ∃d[Height(J) ≥ d & NOT for n girls x : Height(x) ≥ d])
the largest number n such that there is a height John reaches and it’s
not the case that n girls do is 3.
= exactly two girls are shorter than John.
The reasoning in (71) makes it clear that this reading leads to truth conditions
that do not correspond to an available reading; they would make the sentence
true in the situation depicted, where there are two girls shorter than John.
(71) a. ∃d[Height(J) ≥ d & NOT for n girls x : Height(x) ≥ d]

= ∃d[Height(J) ≥ d & fewer than n girls reach d]
b. ∃d[Height(J) ≥ d & fewer than 3 girls reach d]
[Diagram: height scale with the girls' heights g1 to g5 and John's height H(J) marked; two girls are shorter than John.]
The NOT-theory would have to come up with an explanation for why this
reading is unavailable. I am not aware that there is at present such an
explanation. Note that even if we didn’t have the reservations about the SMH
pointed out above, it would not apply here, as the two interpretations don’t
stand in an entailment relation.
To summarize: just like the Pi-theory, the NOT-theory faces an overgen-
eration problem. Both the Pi-theory and the NOT-theory solve this easily
regarding NPIs. The NOT-theory also has a simple story about modals and
negative quantifiers. It does not have an explanation for intensional full verbs
and numeral DPs, and I argue it does not have a story about universal DPs (or
other prospective applications of the SMH) either. Thus I see some progress
compared to the Pi-theory, but not a complete analysis. A conceptual advan-
tage seems to be the NOT-theory’s simplicity. But we will need to reexamine
that in the next subsection.
2.3.4 Reference to degrees — differentials
One of the strengths of the classical analysis of comparatives is the way in

which it deals with explicit reference to degrees. For example differentials in
comparatives, illustrated in (72) and (73), receive an easy and natural analysis.
(72) a. Bill is 1.70 m tall.

b. John is 2″ taller than that.
c. Height(J) ≥ 2″ + 1.70 m
(73) a. John is 2″ taller than Bill is.
b. Height(J) ≥ 2″ + max(λd. Height(B) ≥ d)
= Height(J) ≥ 2″ + Height(B)
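The classical treatment of the differential amounts to simple arithmetic on the degree supplied by the than-clause. The sketch below illustrates this under the assumption of a finite scale measured in whole inches; the function names are hypothetical.

```python
def max_degree(reached, scale):
    """max(λd. ...): the largest degree on the scale that satisfies `reached`."""
    return max(d for d in scale if reached(d))


def differential_comparative(height_john, height_bill, diff, scale):
    """(73b): Height(J) >= diff + max(λd. Height(B) >= d)."""
    return height_john >= diff + max_degree(lambda d: height_bill >= d, scale)


SCALE_INCHES = range(0, 100)  # hypothetical scale in whole inches

two_inches_taller = differential_comparative(70, 68, 2, SCALE_INCHES)
one_inch_taller = differential_comparative(69, 68, 2, SCALE_INCHES)
```

Because the than-clause ends up denoting a single degree, the differential can be added to it directly.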
It is not obvious how to incorporate differentials into the NOT-theory, whose

semantics of a simple example is repeated in (74). That is because the
than-clause does not refer to a degree.

(74) a. John is taller than Bill is.
b. ∃d[Height(J) ≥ d & NOT Height(B) ≥ d]
Among the proponents of the NOT-theory, Schwarzschild (2008) discusses

this problem. He proposes to understand (75a) in terms of (75b); I simplify
this to (75c) for the purposes of discussion.
(75) a. John is 2″ taller than Bill is.
b. ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & Height(B) < d′)]
c. 2″(λd′. d′ ≤ Height(J) & Height(B) < d′)
“the degrees between Bill’s height and John’s are a 2″ interval”
[Diagram: height scale with H(B) and H(J) marked; the stretch between them is labelled "2″ interval".]
The question is how to derive this interpretation. Schwarzschild proposes to

replace NOT in the than-clause with an operator FALL-SHORT. The resulting
LF of our example is given in (75′a) and the semantics of FALL-SHORT in
(75′b). Diff is a variable that is the first argument of FALL-SHORT, to be
bound outside the than-clause and identified with the differential in the
matrix clause (as if the differential was raised out of the embedded clause to
its main clause position).
(75′) a. than [[FALL-SHORT Diff] λd [Bill is d-tall]]
b. ⟦FALL-SHORT⟧ = λDiff. λD⟨d,t⟩. λd. Diff(λd′. d′ ≤ d & D(d′) = 0)
c. Diff(λd′. d′ ≤ d & Height(B) < d′)
Bill’s Height is a Diff-large distance below d
We combine with the differential next, as shown in (76). Then, the degree d is
bound and the usual semantic mechanisms combine this with the rest of the
main clause in (77). This derives (75).
(76) a. [2″ er] [λDiff [than Bill is tall]]
b. λd. 2″(λd′. d′ ≤ d & Height(B) < d′)
Bill’s Height is a 2″ distance below d
(77) [∃ [λd [John is d-tall] [2″ er] [λDiff [than Bill is tall]]]
It seems to me that this is a rather substantial modification of the original

NOT-theory. The basic points about than-clause scope interaction remain the
same (as the reader may verify), but some of the explanation is less obvious.
In particular, I don’t see that scopal behaviour of a modal with same clause
negation necessarily predicts scopal behaviour relative to FALL-SHORT, any
more than it predicts scopal behaviour relative to Pi. I also believe that we
lose the explanation of the unacceptability of negative quantifiers. Neither
of the readings associated with the two possible LFs below is necessarily
uninformative. Finally, I no longer see that the FALL-SHORT-theory is simpler
than the Pi-theory.
(78) a. John is 2″ taller than no girl is.
b. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [[FALL-SHORT Diff] λd [no girl is d-tall]]]
c. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [no girl λx [FALL-SHORT Diff] λd [x is d-tall]]]]]
(78′) ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & [λd. no girl is d-tall](d′) = 0)]
= ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & some girl is d′-tall)]
= John and some girl are at least two inches tall.
(78″) ∃d[Height(J) ≥ d & no girl x : 2″(λd′. d′ ≤ d & Height(x) < d′)]
= no girl is 2″ shorter than John.
I conclude that while the type of analysis discussed in this section — what
one might call scopal theories of quantifiers in than-clauses — has brought
forth some very interesting ideas, there are also unanswered questions. It
may be worthwhile to pursue a scopeless alternative, which is what I will do
in the next section.
3 Analysis: Selection
The strategy I propose in this section is inspired by both Schwarzschild &

Wilkinson and Heim. Schwarzschild & Wilkinson’s use of intervals is retained
in order to be able to interpret a quantifier inside a than-clause. But like Heim,
I attempt to make this move compatible with a simple, standard semantics

of the comparative. The novel aspect of the analysis below concerns how
this is done. I do not adopt a than-clause internal operator Pi and I do not
rely on semantic reconstruction. I propose instead that there is a mechanism
that derives a particular degree from an interval provided by the than-clause.
This degree is compared in the normal way with a matrix clause degree.
The trick will be to ensure that the degree chosen is the right one, i.e. that
the comparison ultimately made reflects the intuitively accessible reading of
the comparative sentence in question. The same selection mechanism will
account for both apparent wide scope and apparent narrow scope readings.
The analysis will not employ a scoping mechanism that is specific to compar-
atives. Its relation to the earlier work discussed above can be simply stated
as ‘keep the intervals, but not the operator’.
Two rationales guide me in pursuing this approach. The first is that a
scoping mechanism inside the than-clause overgenerates in ways that we have
yet to find the means of constraining. Therefore it would be an advantage to
make do without such an extra scopal element. The second is that it remains
a strength of the classical analysis that degree operators combine directly
with expressions referring to degrees, and that differentials in particular can
be accounted for in a direct and straightforward way. Therefore I want to
come out of the calculation of the semantics of the than-clause holding in
my hand the degree we will be comparing things to. The combination of
these two lines of reasoning persuades me to attempt a simplification of
Schwarzschild & Wilkinson, which should of course also cover the apparent
narrow scope data that were problematic for them.
Section 3.1 presents the idea behind the selection analysis and applies it to
straightforward cases. Apparent narrow scope universals are not straightfor-
ward and addressed in Section 3.2. Apparent wide scope existentials similarly
seem problematic and are the issue of Section 3.3. In Section 3.4 I reexamine
comparatives that combine a differential with a quantifier in the than-clause
and propose a refinement of the analysis of the comparative to capture the
data.
3.1 Basic idea and simple cases
I illustrate the idea behind the selection analysis with example (79), which
would not in fact require intervals at all of course. But, suppose that we in
general compositionally derive as the meaning of the than-clause a set of
intervals, as suggested in the Schwarzschild & Wilkinson and Heim theories.

Suppose furthermore that this comes from the basic lexical entry of the
adjective, as indicated in (80). This is what I will assume in this section, for
the sake of uniformity (see Section 4 for more discussion). It amounts to (79′)
in the present case. How do I propose to derive the truth conditions of (79a),
(79b), from that?

(79) a. John is taller than Bill is.
b. Height(John) > Height(Bill)
(79′) a. ⟦than Bill is tall⟧ = λD′. Height(Bill) ∈ D′
b. [Diagram: height scale with Bill's height H(B) marked; label: intervals containing Bill's height.]
(80) ⟦tall⟧ = λD. λx. Height(x) ∈ D
I suggest that general mechanisms available in such situations enable us — in

fact, force us — to pick from the set of intervals something that is suitable
as the input to the comparative operator repeated in (81). I represent this
selection mechanism as ‘the’ in (79″) for the moment. This subsection asks
what the appropriate meaning for ‘the’ is. (Note that the term ‘selection’ is
not intended to imply that there is a genuine choice; I intend to provide one
semantics for ‘the’.)
(81) ⟦-er⟧ = λd_d. λd′_d. d′ > d
(79″) John is taller than ‘the’ than-clause
In the present case, ‘the’ could be an operator selecting the shortest interval
from the set, i.e. Bill’s height, cf. (82). This seems a natural choice, given that
all other intervals contain extraneous material and that the point that really
‘counts’ is just Bill’s height.
(82) min(p⟨⟨d,t⟩,t⟩) = ιD. p(D) & ¬∃D′[D′ ⊂ D & p(D′)]
(shortest p interval)
Irene Heim and Danny Fox (p.c.) point out to me that the sense in which
choosing the minimal interval is ‘natural’ is informativity. (83) below states
what the maximally informative propositions out of a set of true propositions
(say, a question meaning) are.
(83) a. m_inf(w)(Q⟨s,⟨⟨s,t⟩,t⟩⟩) = λq. Q(w)(q) &
¬∃q′[Q(w)(q′) & q ≠ q′ & [Q(w)(q′) → Q(w)(q)]]
b. the maximally informative answers to a question Q(w) (Q(w)
the set of true answers to Q in w) is the set of propositions q in
Q(w) such that there is no other proposition q′ in Q(w) such
that Q(w)(q′) entails Q(w)(q) (i.e. if q′ is in Q(w) then so is q).
Informativity allows us to capture the fact that an appropriate answer to

(84a) is the true answer that entails all the other true answers, i.e. John’s
maximal speed (for example the proposition that he drove 50 mph), and in
a parallel way the minimum amount of flour that suffices in (84b) (see Heim
1994; Beck & Rullmann 1999).
(84) a. How fast did John drive?
λw. λp. ∃d[p(w) & p = λw′. John drove d-fast in w′]
{that John drove 50 mph, that John drove 49 mph, that John
drove 48 mph, . . . }
b. How much flour is sufficient?
λw. λp. ∃d[p(w) & p = λw′. d-much flour is sufficient in w′]
{that 500 g is sufficient, that 501 g is sufficient, that 502 g is
sufficient, . . . }
The definition can be extended to (intensions of) arbitrary sets in the following
way:
(85) m_inf(w)(p⟨s,⟨α,t⟩⟩) = λq. p(w)(q) &
¬∃q′[p(w)(q′) & q ≠ q′ & [p(w)(q′) → p(w)(q)]]

The instance of this generalization that we will be interested in is (86).
(86) a. m_inf(w)(p⟨s,⟨⟨d,t⟩,t⟩⟩) = λD. p(w)(D) &
¬∃D′[p(w)(D′) & D ≠ D′ & [p(w)(D′) → p(w)(D)]]
b. the maximally informative intervals out of a set of intervals p(w)
is the set of intervals D such that there is no other interval D′ in
p(w) such that p(w)(D′) entails p(w)(D) (i.e. if D′ is in p(w)
then so is D).
Fox & Hackl (2006) argue that we want to extend the definition from the
question case to others in order to capture the similarity between (84a,b)
above and (87a), (88a). (87a) refers to the maximum speed John reached and
(88a) refers to the minimum amount that suffices, both maximally informative
in the sense of (85). The instance in (86) extends the analogy from (84a,b)
and (87a), (88a) to (87b), (88b).
(87) a. the speed that John drove

b. than John drove
(88) a. the amount of flour that is sufficient
b. than is sufficient
Hence, ‘the’ in (79″) is m_inf, which yields a singleton, combined with taking
from a set its only member (here represented with max). We can understand
these operators as semantic ‘glue’ (a term introduced by Partee 1984, see
also von Stechow 1995): operations that have to enter into composition, in
addition to what the syntax strictly speaking provides, in order to make
the sentence parts combinable. Their presence is required by the need for
interpretability.
(79‴) John is taller than max(m_inf(than-clause))
The simple example allows me to emphasize another aspect of what I call

the selection analysis: there is no choice in ‘selecting’ a point from a set of
intervals. Only one interpretation is possible for (79). The ‘glue’ we have here
is entirely semantic (and not, say, subject to pragmatic variability). Although
we will see in a moment that quantifiers in than-clauses require some more
elaboration, this will be preserved. Selection means, basically, taking from
the minimal interval(s) the maximal element.
3.1.1 Apparent wide scope universals
Let’s return to the now familiar example (89). We take the than-clause to have
the denotation in (89′).

(89) a. John is taller than every girl is.
b. For every girl x: John’s height exceeds x’s height.
(89′) ⟦than every girl is tall⟧ = λD′. ∀x[girl(x) → Height(x) ∈ D′]
intervals into which the height of every girl falls
[Diagram: height scale with the girls' heights x1, x2, x3 and John's height J marked.]
The intuitive truth conditions of (89) can be described as making a compari-

son between John’s height and the end point of the interval into which all
the girls’ heights fall. If John is taller than the tallest girl, he is taller than all
of them. Thus I propose that from the denotation of the than-clause that is
given in (89′), we first choose the shortest = maximally informative interval
that fits the description (i.e. that covers all the girls’ heights) and then select
the maximal point of that interval.⁴

(90) John is taller than Max>(m_inf(than-clause))
= John is taller than the height of the tallest girl
(91) and (92) below provide the relevant definitions. We extend the notion of
the ordering relation underlying our degree scale from degrees to intervals,
(91). We can then define the maximal element of a set of intervals, and finally
the end point of an interval, (92).
(91) a. ordering of degree points: d > d′
d is larger than d′
b. ordering of intervals: I > J iff ∃d[d ∈ I & ∀d′[d′ ∈ J → d > d′]]
I extends beyond J
(92) a. max> := the max relative to the > relation on intervals or degrees
b. Max>(p) := max>(max>(p))
= the end ‘point’ of the interval that extends furthest
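The two selection steps can be sketched over a finite scale. The code below is an illustration under stated assumptions, not a definitive implementation: degrees are whole centimetres, an interval is a pair of end points, maximal informativity is implemented as "has no proper subinterval in the set", and all function names are hypothetical.

```python
from itertools import combinations_with_replacement

SCALE = range(140, 201)  # hypothetical height degrees in cm

# An interval is a pair (lo, hi) with lo <= hi, read as all degrees from lo to hi.
INTERVALS = list(combinations_with_replacement(SCALE, 2))


def properly_inside(a, b):
    """True if interval a is a proper subinterval of interval b."""
    return b[0] <= a[0] and a[1] <= b[1] and a != b


def m_inf(than_clause):
    """Keep the intervals in the than-clause set that have no proper subinterval in it."""
    members = [i for i in INTERVALS if than_clause(i)]
    return [i for i in members if not any(properly_inside(j, i) for j in members)]


def max_endpoint(intervals):
    """Max>: the end point of the interval that extends furthest."""
    return max(hi for _, hi in intervals)


def every_girl(girl_heights):
    """Than-clause set: intervals into which the height of every girl falls."""
    return lambda i: all(i[0] <= h <= i[1] for h in girl_heights)


girls = [150, 160, 170]
selected = max_endpoint(m_inf(every_girl(girls)))
john_is_taller = 175 > selected
```

For the universal case the minimal interval is unique and spans the shortest to the tallest girl, so the selected degree is the height of the tallest girl.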
We straightforwardly derive the desired meaning. Other universal quantifiers

can be treated in the exact same way. This is illustrated below with the
familiar example containing predict. If my prediction was that John would
be between 1.70 m and 1.80 m tall, then the interval [1.70–1.80] is the unique
shortest interval described by the than-clause. The end point of that interval
is 1.80 m, and the example is correctly predicted to be true if John is taller
than 1.80 m.
4 Fox & Hackl propose to replace maximality with maximal informativity. I have not been able
to develop an analysis that incorporates that proposal. The reason is lack of entailment
among the degrees in the minimal than-clause interval: If I know of a degree d that it falls
in between the height of the smallest girl and the height of the tallest girl, I cannot infer
that a degree d′ larger than d also falls within that interval (d′ might be beyond the height
of the tallest girl) and I cannot infer that a smaller degree d″ also falls within that interval
(d″ might be below the height of the shortest girl). Therefore I will use both maximal
informativity and ordinary maximality.
(93) a. John is taller than I had predicted (that he would be).

b. For every world compatible with my predictions:
John’s actual height exceeds John’s height in that world.
(93′) ⟦than I had predicted (that he would be tall)⟧ =
λD′. ∀w[wR@ → John’s height in w ∈ D′]
intervals into which John’s height falls in all my predictions
(93″) John is taller than Max>(m_inf(than-clause))
= John is taller than the height according to the tallest prediction
What I call selection yields the maximum relative to the ordering relation
linguistically given — ‘larger than’ on the size scale in the case of taller. This
follows from more general interpretive mechanisms suggested independently
(compare Jacobson 1995; Fox & Hackl 2006). Application of these mecha-
nisms is required by the need for the than-clause to serve as input to the
comparative operator.
3.1.2 Apparent narrow scope existentials
We can apply the same strategy to narrow scope existentials. This is illus-
trated with (94) below. In contrast to Heim’s analysis and like Schwarzschild
& Wilkinson’s, I assume that the than-clause denotes the set of intervals in
(94′) (once more via the shifted lexical entry for the adjective, (80)). Importantly,
remember that I assume that the shift to intervals must take place
locally, i.e. within the adjective phrase. I do not assume a genuine mobile op-
erator Pi like Heim (2006b) does (whose LF for (94a) would give Pi wide scope
relative to anyone). We dispense with the interpretations for than-clauses
that were attributed to wide scope of the Pi operator.
(94) a. Mary is taller than anyone else is.
     b. Mary’s height exceeds the largest degree of tallness reached by
        one of the others.
(94′) ⟦than anyone else is tall⟧ = λD′. ∃x[x ≠ Mary & Height(x) ∈ D′]
      intervals into which the height of someone other than Mary falls
The shortest (i.e. maximally informative) than-clause intervals will be the point
intervals at the heights of the other relevant people. (Thus we get rid of the
intervals immediately.)
Out of these, we choose the maximum. This results in the same meaning as
under the classical analysis. Thus the same selection strategy that we used
above will predict the right truth conditions. The analysis extends to other
apparent narrow scope existentials like be allowed etc.
(95) ----•----•----•-------•------>
         x1   x2   x3      M
(94″) Mary is taller than Max>(m_inf(than-clause))
      = Mary is taller than the height of the tallest other person.
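The existential case in (94) can be sketched in the same way. The Python below is again only an illustration: the scale, the heights of the three other people, and the helper names are assumptions, and maximal informativity is approximated by keeping the intervals that contain no other interval of the set.

```python
from itertools import combinations_with_replacement

# Assumption: a discretized height scale in centimetres.
SCALE = range(160, 201)

def than_clause(other_heights):
    """Intervals D' into which the height of at least one other person falls."""
    return [(lo, hi) for lo, hi in combinations_with_replacement(SCALE, 2)
            if any(lo <= h <= hi for h in other_heights)]

def m_inf(ivs):
    """Simplified maximal informativity: keep intervals containing no other one."""
    return [d for d in ivs
            if not any(e != d and d[0] <= e[0] and e[1] <= d[1] for e in ivs)]

def max_gt(ivs):
    """Max>: the largest degree found in the selected intervals."""
    return max(hi for _, hi in ivs)

# Hypothetical heights of x1, x2, x3.
others = [165, 172, 181]
selected = m_inf(than_clause(others))
standard = max_gt(selected)
```

The selected intervals are exactly the point intervals at the others' heights, and Max> then picks the tallest of them, so the result coincides with the classical truth conditions.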
The selection strategy predicts the right truth conditions for these ‘apparent
narrow scope’ and ‘apparent wide scope’ quantifier data without changing
scope. This allows us to predict ungrammaticality of negation straightfor-
wardly, as illustrated below.
3.1.3 Negation
Remember that the unacceptability of (96) could be understood in terms of an
undefined contribution of the than-clause (von Stechow 1984; Rullmann 1995).
The selection analysis presented here can retain this desirable prediction.
The meaning of the than-clause is (96′), in accordance with what is said
above. This is the only meaning possible for the than-clause.
(96) *John is taller than no girl is.
(96′) ⟦than no girl is tall⟧ = λD′. for no girl x: Height(x) ∈ D′
      intervals into which the height of no girl falls
(96′) will not yield a well-defined meaning for the comparative. Just as in
the original analysis of these data, the than-clause will not provide us with a
maximum, since there is no largest interval containing no girl’s height. Max>
is undefined; hence negation in the than-clause leads to undefinedness of the
comparative as a whole. Since there is no other option, we no longer face
the problem of ruling out the apparent wide scope reading of the negative
quantifier.
The simple data discussed in this subsection highlight the potential attrac-
tion of the selection analysis. We keep a simple semantics for the comparative
and don’t double interpretive possibilities with a scoping mechanism. Next,
we turn to all the complications.
3.2 Refinement I: Have to–type modals
This subsection concerns universal quantifiers that do not behave like every
girl, predict and other apparent wide scope universals. Remember from Sec-
tion 2 that modals like have to appear to favour a narrow scope interpretation
rather than the apparent wide scope interpretation described and derived
above for other universals.
(97) Mary wants to play basketball. The school rules require all players to
be at least 1.70 m.
(97′) a. Mary is taller than she has to be.
b. Mary’s actual height exceeds the degree of tallness which she has
in all worlds compatible with the school rules;
Keeping stable our assumptions about the meaning of than-clauses, we
will assume (98) for this example. Selecting the maximum of the shortest
than-clause interval will not yield the desired truth conditions this time,
though: that would amount to the claim that Mary’s height exceeds the
maximum height permitted. The sentence intuitively says that Mary is above
the required minimum. Contrasts like the one between have to and predict
are of course what motivates the scope analysis (apparent wide scope for
predict, apparent narrow scope for have to). A different description of the
facts is that the example with predict (and similar examples with every girl,
should, etc.) has a ‘more than maximum’ interpretation while have to can
have a ‘more than minimum’ interpretation. I see the task for my approach
as having to explain how factors independent of comparative semantics may
result in a ‘more than minimum’ interpretation rather than the expected
‘more than maximum’ reading.
(98) ⟦than she has to be tall⟧ = λD′. ∀w[wR@ → Mary’s height in w ∈ D′]
     intervals into which Mary’s height falls in all worlds compatible with
     the rules
     the beginning of this interval is below Mary’s actual height, i.e. Mary’s
     height exceeds the minimal element of the shortest than-clause interval
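The difference between the two candidate readings of (97′a) can be spelled out in a few lines of Python. This is a sketch under assumptions, not a piece of the analysis: the scale is discretized in centimetres, its upper bound of 230 is an arbitrary stand-in for the fact that the rules impose no maximum, and Mary's actual height is hypothetical.

```python
# Assumption: discretized height scale in centimetres with an arbitrary upper bound.
SCALE = range(150, 231)
REQUIRED_MINIMUM = 170  # the school rule from (97)

# Mary's height in the worlds compatible with the rules.
permitted = [h for h in SCALE if h >= REQUIRED_MINIMUM]

# The shortest interval that contains Mary's height in all permitted worlds.
shortest_interval = (min(permitted), max(permitted))

def more_than_maximum(actual, interval):
    """Reading obtained by applying Max> to the shortest interval."""
    return actual > interval[1]

def more_than_minimum(actual, interval):
    """Intuitive reading: the actual height exceeds the interval's lower end."""
    return actual > interval[0]

mary = 178  # hypothetical actual height
```

With these values the 'more than maximum' reading comes out false and the 'more than minimum' reading true, which is the mismatch the refinement in this subsection has to explain.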
There are two analyses, as far as I am aware, that propose to reduce the
variation in the interpretation of than-clauses with universal modals between
maximum and minimum interpretation to independent factors, such that
the readings collapse into one. Meier (2002) proposes that the ordering
source that modal semantics uses is responsible for a contextually guided
determination of the interpretation, explaining away apparent maxima and
minima both. Krasikova (2008) examines the problem of have to–type modals
in comparatives in particular and employs covert exhaustification to explain
away apparent ‘more than minimum’ interpretations. While both approaches
solve the problem at hand equally well for my purposes, I describe below
Krasikova’s suggestions because they seem to me to offer more promise for
identifying which modal operators give rise to which reading(s).
Krasikova (2008) points out that whether we get a ‘more than minimum’
reading like the one illustrated above for this type of modal or a ‘more than
maximum’ reading parallel to the reading illustrated for predict depends on
the context an individual example is put into. Remember example (99) from
above, which shows that have to–type modals may also give rise to a ‘more
than maximum’ reading, which is the reading we expect under the present analysis.⁵
Thus what distinguishes have to–type modals from others is the availability
of an apparent narrow scope reading (a ‘more than minimum’ reading under
the present perspective).
(99) He was coming through later than he had to if he were going to retain
the overall lead. (from Google, cited from Krasikova 2008)
Krasikova further observes that the universal modals that can give rise to
the ‘more than minimum’/apparent narrow scope reading are just the ones
that occur in sufficiency modal constructions (SMC). An example of an SMC
is given below (von Fintel & Iatridou 2005).
(100) You only have to go to the North End (to get good cheese).
5 It is not at present clear to me under what circumstances a have to–type modal seems
to permit a more-than-maximum interpretation. Relevant factors may be the choice of a
negative polar adjective and a subjunctive-like interpretation (Danny Fox and Irene Heim,
p.c.). Personally, I find this interpretation very hard to get.
(100′) Truth conditions: You do not have to do anything more difficult than to go
                         to the North End (to get good cheese).
       Implicature:      You have to go to the North End or do something at least
                         as difficult (to get good cheese).
The combination of only and a modal in the SMC considers alternatives to the
proposition that is the complement of have to, and ranks those alternatives
on a scale. Plausible alternatives for our example and their ranking are given
in (101). They provide the domain of quantification, C in (102); (102a) sketches
a structure for the example, (102b) a meaning for ‘only have to’ and (102c)
the outcome, which corresponds to the desired truth conditions (100′). Note
that the SMC reading is one that identifies the point on a scale that is the
minimum sufficiency point, as illustrated in (103).
(101) a. that you go to the nearest supermarket, that you go to the North
End, that you go to New York, that you go to Italy
b. SUPER < NE < NY < Italy (where ‘<’ means: is easier than)
(102) a. [[only have to]_{C,<} [you go to the North End]]
      b. ⟦only have to⟧_{C,<}(p)(w) = 1 iff
         ∀q[q ∈ g(C) & ¬(q < p) → ¬have to(q)(w)]
      c. For all q such that q is in g(C) and ¬(q < NE): ¬have to(q)
(103) ------------•------------>
      necessary   NE   not necessary
My sketch leaves unaddressed all the thorny problems of the SMC construc-
tion like the composition of only and have to with the rest of the clause,
and the problem of only’s presupposition; compare in particular von Fintel &
Iatridou 2005 and Krasikova & Zhechev 2006. What is important for present
purposes is Krasikova’s observation that the interpretation that have to–type
modals give rise to in than-clauses can be seen as an SMC interpretation. The
‘more than minimum’ interpretation, just like the SMC, identifies the point
on a scale that is the minimum sufficiency point. Whatever is a plausible
analysis of the SMC should be extendable to the problem at hand.
(104) ------------•------------>
      necessary 1.70 m  not necessary
Krasikova suggests that have to–type modals can combine with Fox’s (2007) covert
exhaustivity operator EXH instead of only; the two have basically the same
meaning. This is what happens in our comparatives, and this is responsible for
the ‘more than minimum’ interpretation.⁶ A structure for the than-clause
of (97′a) is given in (105a). Its interpretation using (102b) is (105b). Suppose
now that the relevant alternatives are the propositions in (106a), which place
Mary’s height in varying intervals. Our context is such that difficulties arise
with respect to reaching a certain height. Being short is not hard, being
tall is difficult. Thus the ordering of the alternatives in (106a) is one that
ranks them according to the height of the interval on the tallness scale into
which Mary’s height falls. The requirement easiest to meet is the minimal
compliance height. Given this, (105b) can be paraphrased as (105c).
(105) a. (than) [1 [[EXH has to]_{C,<} Mary be t1 tall]]
      b. [λD′. ∀q[q ∈ g(C) & ¬(q < λw. M’s height in w ∈ D′) →
         ¬have to(q)]]
      c. [λD′. nothing more difficult is required than for Mary’s height
         to fall within D′]
(106) a. {λw. Mary’s height in w ∈ D1, λw. Mary’s height in w ∈ D2,
         λw. Mary’s height in w ∈ D3, ...}
      b. If the ordering in terms of height is D1 < D2 < D3 ... then:
         λw. M’s height in w ∈ D1 < λw. M’s height in w ∈ D2 <
         λw. M’s height in w ∈ D3 < ...
         (where ‘<’ means: is easier; in our context, being shorter is easier
         than being taller.)
Applying maximal informativity as usual yields the meaning below for the
subordinate clause, the minimum ‘point’ as desired. Selection with Max>
is trivial; the resulting meaning is that Mary’s actual height exceeds the
minimum compliance height.
6 As an anonymous reviewer points out, this raises the question of why we cannot have an
overt only in such sentences, cf. the ungrammaticality of (ia). The editors point out that
extraction of the associate of only is not good, cf. (ib). This would have to be different for
EXH than for only in order to answer the reviewer’s question.
(i) a. *Mary is taller than she only has to be.
    b. *Who_F did Mary only _ call?
(107) m_inf([λD. nothing more difficult is required than for Mary’s height
to fall within D])
= {the minimum compliance height}
= {[1.70–1.70]}
SMC readings of have to–type modals explain the ‘more than minimum’
reading that they can give rise to in comparative than-clauses with the single
assumption that EXH takes the place of only. Internal to the subordinate
clause, exhaustification occurs. Exhaustification of the than-clause reduces
the than-clause interval to a point. The ‘point’ that exhaustification yields is
the minimum compliance height.
I follow Krasikova in making the connection between SMC use and ‘more
than minimum’ readings and in her analysis in terms of exhaustification. This
allows me to maintain the selection analysis from the previous subsection.
According to this analysis, have to–type modals don’t require any revision of
the semantics of comparative constructions. We need to take into account
the special semantics of SMC modals instead. Contrary to appearances,
we uniformly select a degree from an interval via Max> ; with have to, we
may apply Max> after exhaustification. This gives rise to a ‘more than
minimum’/apparent narrow scope reading. If exhaustification does not apply,
we get the regular ‘more than maximum’ = apparent wide scope reading (cf.
example (99) above). Modals that do not permit an SMC reading do not permit
a ‘more than minimum’ reading either, because the ‘more than minimum’
reading is an SMC reading. I refer the reader to Krasikova 2008 for further
discussion. Crucially for present purposes the correlation with SMC use
provides an independent criterion for when to expect which reading. The
contrast between the different kinds of universal quantifiers is not analysed
as a scope effect. The analysis argued for here makes the interpretation of
have to–type modals a property of those particular lexical items. They are
the only apparent narrow scope items requiring special attention since in
contrast to the scope analysis’ procedure, apparent narrow scope existentials
have already been taken care of.
3.3 Refinement II: Indefinites, numeral NPs and the like
This section concerns existential quantifiers that do not behave like NPI any
and other apparent narrow scope existentials. The problem for the selection
strategy can be illustrated by the example below.
(108) John is taller than exactly five of his classmates are.
(108′) a. Exactly five of John’s classmates are shorter than he is.
       b. #John is taller than the tallest of his 5 or more classmates.
The intuitively available interpretation (108′a) looks once more like a straightforward
wide scope reading of the numeral quantifier. Application of the
selection strategy predicts an interpretation that is unavailable, (108′b), as
illustrated below.
(109) λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′
      intervals into which the height of exactly 5 classmates falls
      Max>(m_inf([λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′])) =
      the height of John’s tallest classmate, as long as there are at least 5
      --•---•---•----•---•---•---•--•------>
        c1  c2  c3   c4  c5  c6  c7 c8
                                    Max>
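The unwanted prediction in (109) can be reproduced with a short Python sketch. It rests on assumptions that are not in the text: eight hypothetical classmate heights in centimetres, interval endpoints restricted to those heights, and maximal informativity approximated by keeping the intervals that contain no other interval of the set. John's height is chosen so that exactly five classmates are shorter than he is.

```python
from itertools import combinations_with_replacement

# Hypothetical heights of the classmates c1..c8 in centimetres.
classmates = [158, 162, 166, 171, 175, 179, 183, 186]

def than_clause(heights, n):
    """Intervals (endpoints taken from the heights) containing exactly n heights."""
    pts = sorted(heights)
    return [(lo, hi) for lo, hi in combinations_with_replacement(pts, 2)
            if sum(lo <= h <= hi for h in heights) == n]

def m_inf(ivs):
    """Simplified maximal informativity: keep intervals containing no other one."""
    return [d for d in ivs
            if not any(e != d and d[0] <= e[0] and e[1] <= d[1] for e in ivs)]

def max_gt(ivs):
    """Max>: the largest degree found in the selected intervals."""
    return max(hi for _, hi in ivs)

standard = max_gt(m_inf(than_clause(classmates, 5)))

john = 177
intuitive_reading = sum(h < john for h in classmates) == 5   # (108'a)
selection_reading = john > standard                          # (108'b)
```

Selection returns the height of the tallest classmate, so the sketch makes (108′b) false in a scenario where the intuitively available reading (108′a) is true.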
We face the combined challenge of (i) predicting the right interpretation and
(ii) not predicting the non-existing one. I propose to tackle this problem
through a more thorough analysis of numeral NPs. We will first consider
indefinite NPs in the context of than-clauses and then move on to numerals
and example (108).
3.3.1 Singular and plural indefinites
Singular indefinites allow in principle two interpretations in than-clauses: an
apparent wide scope and an apparent narrow scope reading. Which reading(s)
is/are possible depends on the indefinite as well as the sentence context.
We have seen examples with NPIs in which only the narrow scope reading is
available. An example that has a wide scope reading is given in (110). (111)
and (112) provide two examples which I take to be genuinely ambiguous (the
English version of (111) probably is too, although native speakers seem to
have some difficulty judging the example).
(110) a. John is taller than one of the girls is.
      b. There is a girl x such that John is taller than x.
(111) Annett hat lauter gesungen als eine Sopranistin.
      Annett has louder sung than a soprano
      ‘Annett sang more loudly than a soprano did.’ (German)
(111′) a. There is a soprano x such that Annett sang more loudly than x.
       b. Annett sang more loudly than any soprano did.
(112) Sveta could solve this problem faster than some undergrad could.
(112′) a. There is an undergrad x such that Sveta could solve this problem
          faster than x could.
       b. Sveta could solve this problem faster than any undergrad could.
For examples with apparent narrow scope existentials it was demonstrated
above (with an NPI indefinite, anyone else) how the selection analysis can
derive an appropriate interpretation corresponding to the apparent narrow
scope reading. What about the apparent wide scope reading? One option
open to us is to acknowledge that indefinites quite often give rise to apparent
wide scope readings — so-called specific readings — and to adopt whatever
mechanism is appropriate for the analysis of specific readings in general for
apparent wide scope indefinites in than-clauses. This is what I will do, and I
use the choice function mechanism as the probably best known analysis of
specific indefinites (e.g. Reinhart 1992; Kratzer 1998; but see Endriss 2009 for
a different analysis). I illustrate with example (113a) from Heim 1982, where a
friend of mine can have apparent scope over the conditional.
(113) a. If a cat likes a friend of mine, I always give it to him.
         There is a friend of mine such that if a cat likes him, I give it to
         him.
      b. ∃f: CH(f) & [if a cat likes f(friend of mine), I give it to him]
         If a cat likes the friend of mine selected by f (f a choice function),
         I give it to him.
Furthermore, I will assume that indefinite NPs, e.g. with German ein (‘a’),
are ambiguous between the ‘normal’ interpretation ‘∃x’ (existential quan-
tification over individuals) and the ‘specific’ interpretation ‘∃f ’ (existential
quantification over choice functions). Below I provide a selection analysis
of the two readings of (111) under those assumptions.⁷ On this analysis,
the apparent narrow scope reading amounts to a ‘∃x’ interpretation and
7 I use the German example because the larger English inventory of indefinites makes it hard
for me to determine which examples are genuinely ambiguous.
the apparent wide scope reading amounts to a ‘∃f ’ interpretation for the
indefinite.
(114) a. [als [1 [eine_x Sopranistin t1 laut gesungen hat]]]
         = [λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]
         intervals that cover the loudness of soprano singers
         Annett sang more loudly than
         Max>(m_inf([λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]))
         = Annett sang more loudly than the loudest soprano.
         = Annett sang more loudly than any soprano did.
      b. [als [1 [eine_f Sopranistin t1 laut gesungen hat]]]
         = [λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]
         intervals that include the loudness of the soprano selected by f
         ∃f: CH(f) & Annett sang more loudly than
         Max>(m_inf([λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]))
         = Annett sang more loudly than the soprano selected by f (f a
         choice function).
         = There is a soprano x such that Annett sang more loudly than x.
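The two readings derived in (114) can be contrasted in a minimal Python sketch. Everything concrete here is an assumption made for illustration: the three sopranos, their loudness values, and the modelling of a choice function as a function that returns one member of the set it is applied to.

```python
# Hypothetical sopranos with their maximal loudness (arbitrary units).
sopranos = {"s1": 82, "s2": 88, "s3": 95}

def narrow_reading(subject, group):
    """'∃x' reading after selection: louder than the loudest soprano, as in (114a)."""
    return subject > max(group.values())

def specific_reading(subject, group):
    """'∃f' reading, as in (114b): some choice function picks a soprano
    whose loudness the subject exceeds."""
    # One choice function per member: each returns that member for this set.
    choice_functions = [lambda g, k=k: k for k in group]
    return any(subject > group[f(group)] for f in choice_functions)

annett = 90
```

With Annett at 90, the specific reading is true (she is louder than s1 and s2) while the narrow reading is false (s3 is louder), so the two interpretations of the indefinite come apart as described.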
I further assume that the usual factors (in particular, the nature of the
indefinite and what readings the sentence context permits) decide when we
can get which reading(s) of a singular indefinite. I have nothing illuminating
to say about the particulars of this; note, however, that I do assume that
apparent narrow scope readings are possible with indefinites/existentials
other than NPIs. My intuitions regarding German indefinites like jemand
(someone) + anders/sonst (other/else), wh-word + other/else convince me
of this in particular, because these indefinites are not, I believe, plausibly
analysed as polarity items, nor are they plausibly analysed as generic (hence
not existential). Other languages’ inventory of indefinites may make my view
of what the interpretive possibilities of existentials in than-clauses are appear
less obvious. I am grateful in particular to Sveta Krasikova for discussion of
this point.
(115) a. Hier ist es schöner als anderswo.
         here is it nicer than elsewhere
‘It’s nicer here than it is elsewhere.’
b. possible reading:
It is nicer here than it is anywhere else.
(116) a. Sam ist schneller als jemand anderes/sonstwer.
Sam is faster than someone other/someone else
‘Sam is faster than another person.’
b. possible reading:
Sam is faster than anyone else is.
Also, the data in (117) (in addition to (111) above) provide an indefinite, ein
anderer (‘another’), that is ambiguous. Both (117a) and (117b) were collected
informally from the web. Context makes it clear that (117a) is intended to
mean ‘faster than everyone else’ and (117b) is intended to mean that someone
was slower.
(117) a. Wir denken 7-mal schneller, als ein anderer reden kann.
we think 7 times faster than an other talk can
‘We think seven times faster than anyone else can talk.’
b. Die meisten überholten mich, aber ab und zu war ich auch
the most passed me but now and then was I also
mal schneller als ein anderer.
once faster than an other
‘Most people passed me, but now and then I was faster than
someone.’
Matters look somewhat different when we consider plural indefinites. Beginning
with bare plurals, note that many examples sound strange (thank you to
Irene Heim for example (118)).
(118) a. John is taller than a giraffe.
      b. ??John is taller than giraffes.
(119) a. Prof. Shimoyama hat einen längeren Beitrag geschrieben
         Prof. Shimoyama has a longer contribution written
         als eine Doktorandin.
         than a Ph.D. student
         ‘Prof. Shimoyama wrote a longer contribution than a Ph.D. student.’
         (ok: ∃x, ok: ∃f)
      b. ??Prof. Shimoyama hat einen längeren Beitrag geschrieben
         Prof. Shimoyama has a longer contribution written
         als Doktorandinnen.
         than Ph.D. students
         ‘Prof. Shimoyama wrote a longer contribution than Ph.D. students.’
(120) a. Hans ist schneller gelaufen als eine Schwester von Greg.
Hans ran faster than a sister of Greg’s. (ok: ∃x, ok: ∃f )
b. ??Hans ist schneller gelaufen als Schwestern von Greg.
Hans ran faster than sisters of Greg’s.
c. Hans ist schneller gelaufen als einige Schwestern von Greg.
Hans ran faster than several sisters of Greg’s. (ok: ∃f )
The version with the singular indefinite can have an apparent narrow scope or
an apparent wide scope interpretation (with some speaker variation regarding
which interpretation is favoured). It is known that bare plurals prefer narrow
scope interpretations — let’s say this implies that the choice function ‘∃f ’
interpretation is dispreferred. What the oddness of the plural data tells us,
then, is that there is something unexpectedly wrong with the non-specific ‘∃X’
interpretation of the plural indefinite (I write capital ‘X’ to indicate plurality,
in contrast to ‘x’ for singular). Note that the data (118)–(120) improve when
some or several/einige is added to the plural indefinite. They then have an
apparent wide scope or ‘∃f ’ interpretation. The following generalization
emerges:
(121) Max>(m_inf(λD.∃X[...])) is dispreferred relative to
      Max>(m_inf(λD.∃x[...])).
A plural indefinite ambiguous between ‘∃X’ and ‘∃f ’ will yield ‘∃f ’.
A plural indefinite that prefers the ‘∃X’ interpretation will sound
strange.
Why should a plural indefinite sound odd unless it can easily receive a
specific interpretation? The generalization is intuitively unsurprising once
we examine the ‘∃X’ interpretation more closely. Careful consideration as to
what it would mean in the case of (120), provided in (122a), reveals that (given
that there is more than one sister of Greg’s) it would be true iff the sentence
with the singular ‘∃x’ (‘any sister of Greg’s’) would be true. I suggest that
this makes the interpretation (122a) somehow inappropriate for the example.
Perhaps this can be seen as a matter of economy: the plural has no purpose,
hence cannot be used gratuitously.
(122) a. #Hans ran faster than
         Max>(m_inf([λD′. ∃X[∗sister(X) & ∀x ∈ X: x’s speed ∈ D′]]))
         = Hans ran faster than any sister of Greg’s.
      b. ∃f: CH(f) & Hans ran faster than
         Max>(m_inf([λD′. ∀x ∈ f(∗sister): max(λd. x ran d-fast) ∈ D′]))
         = Hans ran faster than each of the sisters selected by f (f a choice
         function).
         (dispreferred with bare plural, ok with some/several)
(123) is a first shot at what the relevant constraint might effect. The reading
that survives, (122b), is one in which, compared to the corresponding singular
indefinite, the plural serves a purpose.
(123) Ban on Unmotivated Pluralization (BUMP):
      Do not quantify over a plurality if quantification over a singularity
      lets you infer the same reference.
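The observation behind the BUMP, namely that the plural ‘∃X’ interpretation in (122a) picks out the same degree as the singular ‘∃x’ interpretation, can be checked on a toy model in Python. The sketch assumes three hypothetical sisters with the speeds given below, treats pluralities as groups of two or more, restricts interval endpoints to the attested speeds, and approximates maximal informativity by keeping intervals that contain no other interval of the set.

```python
from itertools import combinations, combinations_with_replacement

# Hypothetical speeds of Greg's three sisters (km/h).
speeds = [12, 14, 17]

def intervals_for(groups, values):
    """Intervals (endpoints from the values) covering every member of some group."""
    pts = sorted(values)
    return [(lo, hi) for lo, hi in combinations_with_replacement(pts, 2)
            if any(all(lo <= v <= hi for v in g) for g in groups)]

def m_inf(ivs):
    """Simplified maximal informativity: keep intervals containing no other one."""
    return [d for d in ivs
            if not any(e != d and d[0] <= e[0] and e[1] <= d[1] for e in ivs)]

def max_gt(ivs):
    """Max>: the largest degree found in the selected intervals."""
    return max(hi for _, hi in ivs)

singular = [(v,) for v in speeds]                      # '∃x': single sisters
plural = [g for n in range(2, len(speeds) + 1)
          for g in combinations(speeds, n)]            # '∃X': two or more sisters

standard_sg = max_gt(m_inf(intervals_for(singular, speeds)))
standard_pl = max_gt(m_inf(intervals_for(plural, speeds)))

hans = 15
# '∃f' reading (122b): some selected plurality is entirely slower than Hans.
specific_pl = any(hans > max(g) for g in plural)
```

Both non-specific interpretations select the speed of the fastest sister, so the plural adds nothing; only the choice function reading, true here for the plurality with speeds 12 and 14, gives the plural a purpose.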
It would be good to be able to reduce this phenomenon to other cases with a
similar semantics.⁸ Below I relate than-clauses to definite descriptions and
embedded questions (I am once more inspired by Danny Fox (p.c.) in making
this connection). The idea is that all three constructions share some sense of
maximality and/or maximal informativity (Fox & Hackl 2006 and the above
considerations). So (124a) refers to the maximal, and in the sense of (85)
above, the maximally informative speed that John ran; (124b) will require
the maximally informative answer, i.e. the maximal speed John reached; and
according to the analysis developed here, (124c) is of course analogous.
(124) a. the speed that John ran
      b. how fast John ran
      c. than John ran
8 An anonymous reviewer and Danny Fox pointed out to me that a plural is not generally
dispreferred when a singular yields the same interpretation, contrary to a claim I made in an
earlier version of this paper. Negation and other downward monotone environments allow
plural indefinites, as the example in (i) illustrates. I thank them for pointing out this flaw to
me.
(i) We don’t sell apples (??an apple) in this store.
    There were no women present.
The following three sets of data replace the proper name in (124) with various
kinds of indefinites in the three constructions. The plain singular indefinite
is fine and picks out the fastest speed in the definite description and the
question as well as in the than-clause — in addition to a possible specific
reading. The bare plurals are somewhat odd, which we can explain if a
constraint like the BUMP above is operative (and the ‘∃f ’ interpretation is
dispreferred). The last set, with plural some indefinites, is fine and has the
specific reading. Plural indefinites with some are different from bare plurals
in easily allowing an ‘∃f ’ interpretation.
(125) a. the speed that a sister of Greg’s ran
      b. how fast a sister of Greg’s ran
      c. than a sister of Greg’s ran
(126) a. ??the speed that sisters of Greg’s ran
b. ??how fast sisters of Greg’s ran
c. ??than sisters of Greg’s ran
(127) a. the speed that some sisters of Greg’s ran
b. how fast some sisters of Greg’s ran
c. than some sisters of Greg’s ran
These data share the problem of having to determine unique reference from
a set via maximality/informativity. They motivate the way that the BUMP
is phrased above. Perhaps it is the nature of maximality/informativity as
‘glue’ that makes it sensitive to such a constraint: the step of postulating
such operators is an inference one draws to have things make sense, and
such inferences are subject to ‘making sense’-type of requirements like the
BUMP. But I hasten to add that I am by no means confident that I understand
what is at stake and that more work ought to be done in figuring out what
the BUMP is really about.
I conclude this subsection with a couple of comments on further kinds
of indefinites. The first data point confirms the perspective on the data
developed so far with the German example (128), where the obligatorily
weak lauter (several/many) sounds very strange. Only einige (several) is
acceptable, under an apparent wide scope reading.
(128) Annett hat lauter gesungen als einige/??lauter Sopranistinnen.
      Annett has louder sung than several sopranos
      ‘Annett sang more loudly than several sopranos.’
This can be understood if lauter disprefers a choice function analysis, permitting
only the BUMP violating reading (128′a), while einige yields an acceptable
interpretation in terms of (128′b). Our assumption about lauter vs. einige
is confirmed by (129), where only the version with einige allows the specific
interpretation of the NP ‘relatives of mine’.
(128′) a. #Annett sang more loudly than
          Max>(m_inf([λD′. ∃X[∗soprano(X) &
          ∀x ∈ X: max(λd. x sang d-loudly) ∈ D′]]))
          = Annett sang more loudly than any soprano.
       b. ∃f: CH(f) & Annett sang more loudly than
          Max>(m_inf([λD′. ∀x ∈ f(∗soprano): max(λd. x sang d-loudly) ∈ D′]))
          = Annett sang more loudly than each of the sopranos selected
          by f (f a choice function)
(129) a. Wenn einige Verwandte von mir sterben, erbe ich einen Bauernhof.
      b. Wenn lauter Verwandte von mir sterben, erbe ich einen Bauernhof.
         ‘If several relatives of mine die, I will inherit a farm.’
Similarly, we might expect that NPIs in than-clauses will only be licensed on
the apparent narrow scope reading ‘∃x’ (perhaps they have no ‘∃f’ interpretation,
or perhaps that interpretation would fail to satisfy the licensing
requirements on their context). This predicts that singular NPIs only have
an apparent narrow scope reading. It also makes the interesting prediction
that plural NPIs should be odd in than-clauses. (130b) is judged degraded
compared to (130a) and (130c) by some speakers, but not by all.
(130) a. John solved this problem faster than any girl did.
b. ??John solved this problem faster than any girls did.
c. John solved this problem faster than any of the girls did.
I don’t understand why some people judge (130b) to be fine; I wonder whether
a Free Choice interpretation of any girls is possible for those who accept the
sentence.
A final remark: it is not the case that plural indefinites in than-clauses
are generally bad, not even narrow scope ones. The data in (131) embed the
indefinite beneath another operator, and the BUMP does not apply.
(131) a. More people bought books than read magazines.
      b. I buy books more often than I buy magazines.
To sum up: indefinites are semantically ambiguous, and this shows up in
than-clauses just like it does elsewhere. Apparent wide scope of indefinites is
analysed as pseudoscope: a specific reading. Sometimes one interpretation is
excluded by independent factors. In particular an economy constraint BUMP
can rule out ‘∃X’ for plural indefinites in than-clauses.⁹ The analysis rests on
how the semantic glue interacts with intervals, and on how the interpretation
is derived. I assume that the semantic glue is sensitive to BUMPy constraints,
i.e. that it is a natural place for their application.
3.3.2 Numerals
With these results regarding indefinites in place, let us next be somewhat
more precise in our semantic analysis of ‘exactly n’. Like Gajewski (2008), we
employ a more elaborate analysis of these numerals (compare Hackl 2001a,b;
9 It is not clear to me that competing analyses of quantifiers in than-clauses can easily explain
the pattern of singular vs. plural indefinites. To give an example, the Pi analysis (supposing
it goes along with my assumptions about the semantics of plural indefinites) predicts for (ia)
a narrow scope reading (ic) in addition to the wide scope reading (ib).
(i) a. John was faster than (some) sisters of Greg’s were.
    b. ∃X[∗sister(X) & ∀x ∈ X: Speed(John) > Speed(x)]
       ‘Some sisters of Greg’s were slower than John.’
    c. Speed(John) > max(λd. ∃X[∗sister(X) & ∀x ∈ X: Speed(x) ≥ d])
       ‘John’s speed exceeds the speed reached by the slowest member of a plurality of
       sisters of Greg’s’
       = John was faster than the second fastest sister of Greg’s.
    d. ∃d[Speed(John) ≥ d & NOT ∃X[∗sister(X) & ∀x ∈ X: Speed(x) ≥ d]]
    e. Suppose Greg has three sisters:
       --•--------•--------•------->
         x1       x2       x3
       x2: largest speed reached by every member of a plurality
       of sisters of Greg’s
An interpretation corresponding to (ic) is not available and would have to be excluded, in
the plural case but not in the singular. The reading predicted by the NOT-theory, (id), is
parallel. Depending on how hard it is to do so, an argument might be gained for the selection
analysis from the pattern of singular vs. plural indefinites in than-clauses.
Krifka 1999 on the semantics of such NPs). Remember the simple example
(66) and its analysis.
(66) a. Exactly three girls weigh 50 lb.
b. [EXACT [XP (exactly) three_F girls weigh 50 lb.]]
(66′) ⟦three_F girls weigh 50 lb.⟧o = ∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)]
⟦three_F girls weigh 50 lb.⟧f =
{∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)] : n ∈ N}
(67) ⟦EXACT⟧(XPf)(XPo) = 1
iff XPo = 1 & ∀q ∈ XPf : ¬(XPo → q) → ¬q
‘Out of all the alternatives of XP, the most informative true one is the
ordinary semantics of XP.’
(68) ⟦(66b)⟧ = 1 iff
∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)] &
∀n[n > 3 → ¬∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]] iff
max(λn. ∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]) = 3
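The effect of EXACT in (67) and (68) can be checked on a small model. The following is only a minimal sketch under stated assumptions: the function names, the toy set of girls, and the restriction of the alternative set to the numerals 1 up to the number of girls are all hypothetical choices made for illustration.

```python
from itertools import combinations

def some_plurality(n, girls, weighs_50_lb):
    """Ordinary value of 'n girls weigh 50 lb': some group of n girls
    is such that each member weighs 50 lb (distributive reading)."""
    return any(all(weighs_50_lb[g] for g in group)
               for group in combinations(girls, n))

def exact(n, girls, weighs_50_lb):
    """EXACT as in (68): the n-alternative is true and every alternative
    with a larger numeral (i.e. one it does not entail) is false."""
    alternatives = range(1, len(girls) + 1)  # assumed finite alternative set
    return some_plurality(n, girls, weighs_50_lb) and all(
        not some_plurality(m, girls, weighs_50_lb)
        for m in alternatives if m > n)

# Toy model (hypothetical): three of four girls weigh 50 lb.
girls = ["Ann", "Bea", "Cleo", "Dot"]
weighs_50_lb = {"Ann": True, "Bea": True, "Cleo": True, "Dot": False}
```

On this model the 'two' alternative is true, but EXACT rejects it because the stronger 'three' alternative is also true; only n = 3 passes.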
This step does not immediately solve our problem. If we give the than-clause
in (108) the semantics in (132), nothing changes: we still compare with the
tallest of John’s classmates, as long as there are at least five. Notice, however,
that this interpretation is just as strange as the plain plural indefinite ‘∃X’
interpretation above, since the number information serves no real purpose
for the truth conditions.
(108) John is taller than exactly five classmates of his are.
(132) λD′. max(λn. ∃X[∗classmate(X) & card(X) = n & ∗Height(X) ∈ D′]) = 5
Intervals into which the height of exactly five of John’s classmates
falls
(133) John is taller than
Max>(m_inf(λD′. max(λn. ∃X[∗classmate(X) & card(X) = n &
∗Height(X) ∈ D′]) = 5))
(133′) Presupposition: John has at least five classmates.
Assertion: He is taller than any of them.
This reading is thus ruled out by the same constraint BUMP. We should then
alternatively consider a choice function analysis of the indefinite ‘n classmates’.
I combine this below with the assumption that exactly is evaluated in
the matrix clause. In (134), we derive the desired interpretation.
(134) max(λn. ∃f[CH(f) & John is taller than
Max>(m_inf(λD′. ∀x ∈ f(λX. ∗classmate(X) & card(X) = n) :
Height(x) ∈ D′))]) = 5
‘The largest number n such that John is taller than the tallest of the
n classmates of his selected by some choice function f is 5.’
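The difference between the reading in (133′) and the choice function reading in (134) can be made concrete with a small model. This is a hedged sketch only: heights are given in cm, the names and numbers are hypothetical, and a choice function is simulated by trying every group of the relevant size.

```python
from itertools import combinations

def narrow_reading(john, classmates, n=5):
    """(133'): presupposes at least n classmates and asserts that
    John is taller than all of them."""
    return len(classmates) >= n and john > max(classmates)

def choice_function_reading(john, classmates, n=5):
    """(134): n is the largest number such that some group of that size
    (the value of a choice function) has a tallest member shorter than John."""
    sizes = [k for k in range(1, len(classmates) + 1)
             if any(john > max(group)
                    for group in combinations(classmates, k))]
    return max(sizes, default=0) == n

# Hypothetical heights in cm: five classmates are shorter than John, two are taller.
classmates = [150, 155, 160, 165, 170, 180, 185]
john = 175
```

Here the choice function reading is true and the reading in (133′) is false. Raising John's height above 185 reverses the verdicts, since he is then taller than all seven classmates.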
An LF of example (108) representing a version of Krifka’s analysis looks as in
(135).
(135) a. [EXACT [John is taller [than Max> m_inf [(exactly) 5_F of his
classmates are tall]]]]
b. Out of all the alternatives of the form ‘John is taller than n of his
classmates are’, the most informative true one is ‘John is taller
than 5 of his classmates are’.
The applicability of the constraint BUMP to numeral indefinites is empirically
supported by the data below, which behave in a parallel way to plural
indefinites with some, for example.
(136) a. the speed that two finalists drove
b. how fast two finalists drove
c. than two finalists drove
Thus I suggest that a proper semantic analysis of numeral NPs makes the
facts compatible with a selection solution after all.
3.3.3 Further relevant cases
The analysis developed here for indefinite NPs in than-clauses needs to be

extended to NPs with many and most, which show the same apparent wide
scope interpretations we observed for numerals.
(137) a. John is taller than many of his classmates are.

b. There are many classmates of John’s such that he is taller than
they are.
(138) a. John is taller than most of his classmates are.
b. For most x, x a classmate of John’s: John is taller than x.
I will make further use of the semantics developed by Hackl (2001a,b, 2009)
for these NPs, according to which ‘many N’ is an indefinite NP including a
gradable adjective in the positive form, and ‘most N’ is correspondingly a
superlative. This makes feasible analyses that can be paraphrased in the
following way:10
(137′) John is taller than the tallest of the many-membered group of
classmates of his selected by f (f a choice function).
(138′) John is taller than the tallest of the group selected by f, which
comprises a majority of his classmates (f a choice function).
More detailed analyses are given below ((139) provides the two potential
readings of (137) and (140)–(142) analyse (138)). Besides being able to predict
the existing readings, the BUMP constraint in (123) will rule out the ones that
are intuitively unavailable.
(139) a. #John is taller than
Max>(m_inf([λD′. ∃X[∗classm(X) & many(X) & ∀x ∈ X :
Height(x) ∈ D′]]))
= John is taller than any classmate (as long as there are many).
b. ∃f : CH(f) & John is taller than
Max>(m_inf([λD′. ∀x ∈ f(λX. ∗classm(X) & many(X)) :
Height(x) ∈ D′]))
= John is taller than each of the many classmates selected by f
(f a choice function)
(140) ⟦than [1 [X most of his classmates are t1 tall]]⟧ =
[λD′. ∃X∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗classm(Y) →
¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]
intervals that contain the heights of a majority of John’s classmates
(141) ⟦than [1 [f most of his classmates are t1 tall]]⟧ =
[λD′. ∀x ∈ f(λX. ∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠
X & ∗classm(Y) → ¬d-many(Y)]]) : Height(x) ∈ D′]
intervals that contain the heights of the majority of John’s
classmates selected by f
10 An anonymous reviewer points out that this predicts that these NPs can have the same
specific readings we know from indefinites. I concur, but would like to point out that this
prediction arises from an analysis of these quantifiers as indefinites, not from the application
of that analysis to than-clauses. The empirical test cases include data like (i) below.
(i) a. If many relatives of mine die, I will inherit a farm.
b. If most relatives of mine die, I will inherit a farm.
(142) a. #John is taller than
Max>(m_inf([λD′. ∃X∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠
X & ∗classm(Y) → ¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]))
= John is taller than the tallest of any majority of his classmates.
= John is taller than any of his classmates.
b. ∃f : CH(f) & John is taller than
Max>(m_inf([λD′. ∀x ∈ f(λX. ∃d[∗classm(X) & d-many(X) &
∀Y ∈ C[Y ≠ X & ∗classm(Y) → ¬d-many(Y)]]) : Height(x) ∈
D′]))
= John is taller than the tallest of the majority of John’s
classmates selected by f (f a choice function)
= For most x, x a classmate of John’s: John is taller than x.
To sum up: this subsection has analysed the available vs. unavailable readings
of indefinite NPs in than-clauses using a choice function mechanism plus a
constraint on unmotivated pluralization. The formulation of the BUMP in
(123) is offered as a first version of the constraint we need; what we want to
derive is that it is strange to say ‘John is taller than exactly three girls are’ if
we meant, and might as well have said ‘John is taller than any girl is’. Since
this seems eminently reasonable, I am hopeful that a good way of stating
the relevant constraint exists. Given this, the present section has extended
the selection analysis to apparent wide scope indefinite NPs of various kinds
(including numerals, many and most), using a pseudoscope mechanism
argued for extensively for indefinites independently of comparatives. The
comparative semantics itself remains simple.
3.4 Refinement III: Differentials
The final kind of data that does not immediately fall out from the selection
analysis is represented by example (143) below: a than-clause containing a
universal quantifier in combination with a differential.
(143) a. John is exactly 2″ taller than every girl is.
b. For every girl x: John is exactly 2″ taller than x.
Compared to Heim, and also Schwarzschild & Wilkinson, we seem to have
a problem. Heim’s analysis can derive the intuitive interpretation as shown
below.
(144) [[than every girl is tall] [5 [John is exactly 2″ taller t5]]]
(144′) ⟦[than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ =
λD′. ∀x[girl(x) → Height(x) ∈ D′]
(145) ⟦(144)⟧ = [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is exactly 2″ taller than d)
= for every girl x: John is exactly 2″ taller than x
Choice of Max> on the other hand predicts a different interpretation, which
does not seem right for (143):
(146) John is exactly 2″ taller than Max>(m_inf(than-clause))
= John is exactly 2″ taller than the tallest girl.
The intuitively available reading of (143a) can be described as one in which

we assume that all the girls reach the same height. I call this an assumption
of equality among the individuals universally quantified over, EQ for short.
The EQ appears to speak in favor of a scope solution since it is entailed
by the truth conditions resulting from giving the universal wide scope over
the comparison. It is not entailed by the truth conditions according to
the selection analysis, although it is of course compatible with the truth
conditions in (146) that the girls all have the same height.
Sentence (143a) exemplifies a problem that arises when a than-clause
containing a universal quantifier is combined with a differential that in-
cludes exactly, at most or almost. A differential including at least does not
distinguish between the two sets of truth conditions.
(147) a. John is at most/almost 2″ taller than every girl is.
b. For every girl x: John is no more than 2″ taller than x
c. #John is no more than 2″ taller than the tallest girl.
(148) a. John is at least 2″ taller than every girl is.
b. For every girl x: John is at least 2″ taller than x
c. John is at least 2″ taller than the tallest girl.
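The contrast between the wide scope truth conditions and the Max> selection truth conditions for the three kinds of differential can be verified mechanically. The sketch below is an illustration under assumptions of my own: heights are whole inches, the helper names are hypothetical, and 'at most n' is simplified to 'the gap is no larger than n'.

```python
def wide_scope(john, girls, differential):
    """For every girl x: John's lead over x satisfies the differential."""
    return all(differential(john - g) for g in girls)

def selection(john, girls, differential):
    """Max> reading: the differential measures John's lead over the tallest girl."""
    return differential(john - max(girls))

def exactly(n):
    return lambda gap: gap == n

def at_most(n):
    return lambda gap: gap <= n

def at_least(n):
    return lambda gap: gap >= n

# Hypothetical heights in inches.
john = 66
spread = [60, 62, 64]   # the girls differ in height
equal = [64, 64, 64]    # the girls all have the same height (EQ)
```

With the spread-out heights, 'exactly 2' and 'at most 2' are false on the wide scope reading but true on the selection reading, while 'at least 2' comes out the same on both. When the girls are equally tall, the two readings coincide for all three differentials.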
An unmodified differential does not constitute evidence as strong as an
exactly/at most-type differential, because, while it gives rise to the usual
strengthening implicature that amounts to an exactly reading, this implicature
can be canceled. If we suppose the implicature to be present, the
unmodified differential is parallel to exactly.
(149) a. John is 2″ taller than every girl is.
b. Implicature: John is no more than 2″ taller than every girl is.
c. John is 2″ taller than every girl is, perhaps more.
To sum up the picture so far, differentials with exactly and at most, and
perhaps simple differentials, seem to be problematic for the selection analysis
as opposed to the scope analysis.
However, there is more to say about this issue empirically and theoreti-
cally. Beginning with the theoretical side, note that the interpretation of the
matrix clause in (144) was simplified in terms of not giving the differential
quantifier exactly 2″ independent scope.11 Data like (150) show that such
expressions do take scope, however:
(150) You are allowed to be exactly 6′ tall.
(151) ⟦exactly 6′⟧ = λD. max(D) = 6′
(150′) a. max(λd. ∃w[wAcc@ & you are d-tall in w]) = 6′
The largest permitted height for you is 6′.
b. ∃w[wAcc@ & max(λd. you are d-tall in w) = 6′]
It is permitted that you be exactly 6′ tall.
Hence, in addition to (a more elaborate version of) (144) above, the LF and
interpretation in (152) become possible. For the Pi theory, this leads to
availability of the analysis in (153).
(152) [[exactly 2″] [4 [[than every girl is tall] [5 [John is t4 taller t5]]]]]
(144′) ⟦[than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ = λD′. ∀x[girl(x) →
Height(x) ∈ D′]
(153) ⟦(152)⟧ = ⟦exactly 2″⟧(λd′. [λD′. ∀x[girl(x) → Height(x) ∈ D′]]
(λd. John is d′ taller than d))
= ⟦exactly 2″⟧(λd′. for every girl x: John is d′ taller than x)
= max(λd′. for every girl x: John is d′ taller than x) = 2″
‘The largest amount that John is taller than every girl is 2″.’
11 Thanks to Danny Fox for drawing my attention to this point.
Note that this LF no longer predicts all the girls to have the same height. It
says that John is exactly 2″ taller than the tallest girl, just like the selection
analysis. It is thus not clear that the predictions of the scope analysis are
really different from, and superior to, the selection analysis.
Next, let’s take a closer look at the data. Above, we identified as a problem
that EQ is not predicted, the assumption that all individuals universally
quantified over have the same height (or whatever the gradable predicate
measures). However, the data are quite difficult. While I agree with the
perception in the literature that in (143a) the EQ is plausible, it is clear that it
does not always arise. Below are some examples where it doesn’t; (154)–(156)
are collected from the internet.12 The reader can convince her/himself that
further relevant data can easily be found. The difficulty in determining the
interpretation of data with nominal universal quantifiers is related to the
point mentioned in Section 2 about differentials and intensional verbs. I
mention in (156′) a suggestive example also collected from the web.
(154) Aden had the camera for $100 less than everyone else in town was
charging.
(155) WOW! Almost 4 seconds faster than everyone else, and a 9 second
gap on Lance.
(156) Jones was almost an inch taller than the both of them. (the both
of them = John Lennon and Paul McCartney, Jones = Tom Jones.
The author thinks that Jones was 5′11″ and that Paul McCartney was
about 5′10″. John Lennon is reported to be shorter than McCartney
by about an inch.)
(156′) I finished 30 seconds faster than I expected. [. . . ] I know my 300
yard time more accurately now.
(the continuation suggests that the speaker’s expectation was a
range rather than a precise point in time.)
The examples are straightforwardly analysed using Max> to determine the
relevant ‘point’ provided by the than-clause.13 The differential measures the
distance between that and the main clause degree. This is demonstrated for
(155) below.
12 A naive Google search has not unearthed a clearly relevant example with an exactly-
differential.
13 A different type of example illustrated below is difficult for both a scope and a selection
analysis. I find it hard to decide what such examples mean precisely. It seems plausible to
me that we select some kind of ‘point’ from the meaning of the than-clause, but not in the
way described in the text.
(155′) a. #For all x, x ≠ Z: (Z was) almost 4 seconds faster than x (wide
scope)
b. (Z was) almost 4 seconds faster than Max>(m_inf(λD′. for all x ≠
Z : Speed(x) ∈ D′))
= Z was almost 4 seconds faster than the next fastest person.
(selection Max>)
We face the task of figuring out what distinguishes (143) from (154)–(156),
i.e. why EQ arises in some data but not all. I would like to ask this question
in terms of how the selection analysis might predict not only (154)–(156),
but also (143). To this effect, let’s take a closer look at the combination of a
differential with a comparative.
Note that we understand a claim like (157a) relative to a plausible level of
granularity. For us to judge (157a) to be true, it is in most contexts sufficient
to be precise up to the level of a few millimeters. Suppose on the other hand
that (157b) is about a sensitive piece of machinery. A one millimeter margin
could very well not be acceptable. This means that what we call John’s height,
or that rod’s length, is actually somewhat fuzzy: it is a ‘blob’ or an interval
on the relevant scale whose size depends on context. The sensitivity to a
level of precision is not represented in the standard truth conditions of the
two examples given in (157′).
(157) a. Mary is exactly 2 cm taller than John is.
b. This rod is exactly 2 cm longer than that rod is.
(157′) a. Height(Mary) = Height(John) + 2 cm
b. Length(this rod) = Length(that rod) + 2 cm
To capture this, I follow Krifka (2007) in assuming that a scale can be divided
into different units. A unit on the scale then has to be identified that can
count as a ‘point’ at the contextually relevant level of granularity. Which
division we assume depends on context. Talking about a length of 1.80 m for
example could then refer to a very short or a somewhat larger stretch of the
scale, depending on the relevant standard of precision/unit size. I talk about
unit size as granularity.
13 (continued)
(i) a. Ben was almost a year older than everyone else in his class (because he had just
missed the deadline for the previous school year).
b. #For all x ≠ Ben: Ben was almost a year older than x.
c. #Ben was almost a year older than the next oldest in his class.
d. ?The others’ ages center around a point almost a year younger than Ben.
(158) [Scale diagram: the point 1.80 m on a scale that is divided, on successive
rows, into units of 5 cm, 2 cm and 1 cm.]
I make use of Schwarzschild’s (1996) notion of a cover as a division of an
entity into its contextually relevant parts, and apply it to scales in (159).
Covers provide the relevant granularity.
(159) Let ⟨S, >⟩ be a scale. Then Cov is a cover of S if Cov is a set of
subsets of S such that each d in S is in some set in Cov, each set in
Cov is contiguous and no two sets in Cov overlap. Assume Cov to
be the set of intervals that are of the contextually relevant size.
I furthermore revise the definition of an end “point” from (160) to (161) ((161b)
is the informal version, (161c) the more precise version employing covers).
Note that the distinction between points and intervals dissolves under this
view because what we usually call a point is an interval on the scale whose
size depends on context.
(160) a. Max>(p⟨⟨d,t⟩,t⟩) := max>(max>(p))
= the end point of the interval that extends furthest
b. Let S be a set ordered by R. Then max_R(S) = ιs[s ∈ S & ∀s′ ∈
S[sRs′]]
(161) a. Max>(p⟨⟨d,t⟩,t⟩) := end>(max>(p))
= the end ‘blob’ of the interval that extends furthest
b. end>(D) := ιd. d ⊆ D & ¬∃d′[d′ ⊆ D & d′ > d] & d counts as a
point at the relevant level of granularity
c. Let Cov be the set of intervals that are of the contextually relevant
size.
end>,Cov(D) := ιd. d ⊆ D & d ∈ Cov &
¬∃d′[d′ ⊆ D & d ≠ d′ & d′ ∈ Cov & d′ > d]
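How the end ‘blob’ in (161c) depends on the cover can be illustrated with a small computation. This is a sketch under simplifying assumptions that are mine, not the paper's: the scale is measured in whole millimetres, cover cells are half-open intervals of equal size aligned at 0, and `make_cover` and `end_blob` are hypothetical helper names.

```python
def make_cover(lo, hi, unit):
    """Split the scale [lo, hi) into contiguous, non-overlapping cells
    of the given size, in the spirit of (159)."""
    return [(start, min(start + unit, hi)) for start in range(lo, hi, unit)]

def end_blob(interval, cover):
    """end_{>,Cov}(D) as in (161c): the highest cell of the cover that lies
    inside D, or None if no cell of this size fits inside D."""
    lo, hi = interval
    inside = [cell for cell in cover if lo <= cell[0] and cell[1] <= hi]
    return max(inside) if inside else None

# Hypothetical than-clause interval reaching up to 1.80 m, in mm.
D = (1500, 1800)
fine = make_cover(0, 2500, 5)      # 5 mm units
coarse = make_cover(0, 2500, 50)   # 5 cm units
huge = make_cover(0, 2500, 400)    # 40 cm units
```

With the fine cover the end blob is the cell (1795, 1800), with the coarse cover it is (1750, 1800), and with the 40 cm cover no cell fits inside D at all. The same interval thus yields different ‘points’ depending on the contextually chosen granularity.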
Supposing that we talk about what we roughly call 1.80 m, the meanings of
our two than-clauses could (depending on context, i.e. the relevant cover)
come out as in (162). It is thus in the nature of scales that they have a
part/whole structure whose units are determined in a context dependent
manner.
(162) a. Max>,Cov1 (than John is tall) = [1.798–1.803] (a 0.5 cm unit)

b. Max>,Cov2 (than that rod is long) = [1.7998–1.8002] (a 0.4 mm
unit)
Let’s consider differentials under this refined understanding of scales. A

differential measures the distance from the “point” referred to in the matrix
to the “point” referred to in the than-clause, “point” being determined by
the relevant unit size. Note that a plausible granularity for the than-clause
has to match the granularity level suggested by the differential. If the two
do not match, an odd sentence results. I call this a granularity clash. In the
example below, we know that it is impossible to determine to the second the
amount of time that it took John to learn French. The than-clause comes
inherently with a coarse granularity, which clashes with the granularity of
the differential in (163b).
(163) a. Mary learned arithmetic faster than John learned French.

b. ?Mary learned arithmetic faster than John learned French by 7
minutes 23 seconds.
c. Mary learned arithmetic faster than John learned French by
several months.
We can generalize from the example as follows. In a comparative of the form

(164a), it must at least be given that the cover of the relevant interval that
the than-clause provides (via informativity) furnishes units that are smaller
than the differential; i.e. (164b) is a requirement for the comparative to make
sense. If that is the case, then the unit picked out as a “point” by Max>
will also be smaller than the differential (164c). The comparative can then
measure the gap between the main clause degree and the maximum of the
than-clause with the differential ((164d)). If the maximum itself is larger, this
will be impossible. In our example, suppose that we can with exceptional
precision determine to the day how long it took Mary to learn arithmetic
and John to learn French. We cannot reasonably measure the gap between
two days in terms of the differential ‘7 minutes 23 seconds’. The level of
granularity relevant for the than-clause has to make sense in relation to the
differential.
(164) a. Main Clause Differential than D

b. for all U ∈ Cov: U < Diff
c. Since Max>,Cov (D) ∈ Cov : Max>,Cov (D) < Diff
d. Max>,Cov (Main Clause) = Diff + Max>,Cov (D)
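The requirement in (164b) can be tried out on the time example in (163). The sketch below is an illustration only: all values are in seconds, the function name is hypothetical, and rendering ‘several months’ as 90 days is an assumption made purely so that there is a number to compare.

```python
DAY = 24 * 60 * 60  # one day in seconds

def fits(unit, differential):
    """(164b): the cover units supplied by the than-clause must be
    smaller than the differential; otherwise there is a granularity clash."""
    return unit < differential

# Than-clause granularity: learning times known to the day at best.
unit = DAY
seven_min_23_sec = 7 * 60 + 23   # differential in (163b)
several_months = 90 * DAY        # differential in (163c), assumed value
```

On these values the differential in (163b) clashes with the day-sized unit, while the one in (163c) fits.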
The reasoning works out given that the cover, and therefore the unit that
counts as ‘maximal point’, is determined locally, i.e. than-clause internally,
independently of the differential which will then either fit or clash.14
I think that granularity offers an explanation for the interpretive effect I
call EQ. Consider the situation depicted below for (165). If we have no further
information regarding the situation, the girls’ sizes can be far apart. This
would indicate a large interval. The idea is that the semantics of the than-
clause itself indicates possible Covers. There is then a danger that we have a
coarse-grained cover. A reasonable division of x1 –x5 would be
into relatively long units, hence Max> is long. This would be incompatible
with the differential — a granularity clash. That is, a sentence in which the
than-clause indicates a real spread (e.g. because of a universal quantifier)
brings with it the danger of a granularity mismatch with a differential.
(165) John is exactly 2″ taller than every girl is.
(166) _ _ _• _ _ _• _ _ _ _ _ •_ _ _ •_ _ _ _ _ _• _ _ _ _ _• _ _ _ _/
x1 x2 x3 x4 x5 J

m_inf((than) every girl is tall) = { x1 –x5 }
14 A similar effect can be observed with Covers in the plural domain in examples like (i) below.
(i) a. The women and the men love their child.

b. The Smiths and the Johnsons love their child.
Suppose we are talking about Angelina and Reginald Johnson and Mary and John Smith.
Then the two subjects in (ia) and (ib) refer to the same group, but make different covers
salient (Schwarzschild 1996). By virtue of the cover suggested by the subject, (ia) tends to be
understood as ‘the women love their child and the men love their child’, which is unexpected.
(ib) amounts to ‘the Smiths love their child and the Johnsons love their child’, which is more
expected. The point is that the subject group autonomously makes salient a cover, whether
this leads to a plausible interpretation of the whole or not.
The Cover indicated by the than-clause may agree with the differential
only under an additional assumption of closeness of the individual “points”
covered by the than-clause interval. My suggestion is that if a potential
granularity clash could only be avoided under an additional assumption of
closeness, one tends to assume equality and a default Cover of the than-
clause interval D in terms of the singleton set {D}. This is the EQ. In short,
without an informative context, there is a danger of a granularity clash. The
danger is avoided by the EQ. The EQ would under this analysis be an extra
assumption speakers make in order to ensure that a sentence is meaningful.
(Note that the EQ is not the weakest assumption one could make to ensure
that; perhaps it is the simplest assumption.)
The data above for which the selection analysis automatically makes
good predictions with Max> , (154)–(156), are such that we have a rather clear
expectation about the kind of interval denoted by the than-clause — the range
within which the individual degrees fall is fixed. The context is rich, and
no problems with granularity arise. Thus a genuine Max> interpretation (i.e.
one in which we pick out the maximum from a genuine spread) is possible
without further assumptions. This distinguishes those data from our original
example (165). I suggest that danger of a granularity clash leads to EQ: to
supposing that the ‘points’ that are in danger of being spread over too large
an interval in fact collapse into one. We expect that it should depend on
the amount of information available on the interval covered by the than-
clause whether we get an EQ interpretation or a genuine Max> interpretation.
Additional information to the effect that the points are not the same, but
close enough together for the purposes of the differential, may make the EQ
unnecessary and thus make a genuine Max> interpretation possible for our
EQ data. This appears to me to be correct:
(167) Background: we are running an experiment in which we vary the

growth conditions of seedlings. In particular, we test different
fertilizing agents (ViagraFlor, Dung™, ComposFix and GuanoPlus)
and their effect on how fast our seedlings grow. After two weeks, it
is reported that:
(168) The ComposFix seedlings are exactly 2″ taller than all the others.
(Max> possible)
Danger of granularity clash arises in uninformative contexts and triggers EQ.
I should be able to take the same than-clauses that occurred in Max> examples
and place them into a less fortunate context, and trigger EQ. Again, this
seems the right prediction.
(169) a. This pot dries out exactly 40 min faster than all the others.
(EQ likely)
b. This T-Shirt dries exactly 20 min faster than all the others.
(EQ likely)
We see that minimal pairs can be found that have essentially the same
comparative (differential plus comparative adjective plus than-clause) but
differ as to informativity of background context regarding the than-clause
interval. An uninformative context makes us assume that the interval is
point-like, so that Max> will be well defined and suitable — EQ. If we have
enough background information to be sure that the Max> unit in the than-
clause interval is suitable, we do not panic, make no extra assumptions, and
can get a genuine Max> interpretation as expected.
Things are different with an existential quantifier. Consider (170) against
the same background as before. The minimal than-clause intervals will be the
heights of the individual girls. Max> will be well defined and suitable without
any additional assumptions, and will make this a comparison between John’s
height and the height of the tallest girl, as desired.
(170) John is exactly 2 cm taller than any girl is.

Max> (m_inf((than) any girl is tall))
(171) _ _ _• _ _ _• _ _ _ _ _ •_ _ _ •_ _ _ _ _ _• _ _ _ _ _• _ _ _ _/
x1 x2 x3 x4 x5 J
I conclude that the selection strategy provides a reasonable perspective

on differential comparatives. It depends on context whether we get an EQ
interpretation or a genuine Max> interpretation, and the selection strategy
can explain this. I will not investigate here what a scope strategy could say
about the data.
A more general remark: At this point in the analysis, a pragmatic element
has entered the picture. The ‘glue’ I have been talking about so far is gen-
uinely semantic and seems fully determined (as far as I can see) given the
requirement of interpretability. But scales (following the insights represented
by Krifka’s work) require reference to context and include a pragmatic element
in the shape of the cover. In addition to the maximality/informativity
operators themselves, we need the contextually relevant part/whole structure
of the scale to interpret a particular example. Properties of the cover become
relevant in particular in the presence of differentials, and speakers may be
led to make extra assumptions (EQ). The fuzzy nature of the data, in my
opinion, speaks in favour of the idea that some kind of pragmatic glue is
required to make things work out. Depending on the context, speakers may
or may not have an easy time figuring out what the necessary glue is. That
said, a remaining caveat is a more thorough empirical understanding of the
data with differentials.
4 Summary and conclusions
4.1 Summary
Building on work primarily by Schwarzschild & Wilkinson and Heim, I propose

an analysis of quantifiers in than-clauses in which the quantifier is interpreted
inside the than-clause. A shift from degrees to intervals of degrees makes
this possible. Despite appearances, there is no scope interaction between
quantifier and shifter or quantifier and comparison operator. Instead, there
is uniformly selection of a point from the subordinate clause interval. The
analysis takes from Schwarzschild & Wilkinson the step to intervals. It shares
with Heim that comparison is ultimately reduced to comparison of points.
Intervals are not directly compared. In contrast to Heim and the subsequent
NOT-theory, apparent scope effects like the interpretation of have to–type
modals and exactly n NPs have been explained away via recourse to alternative
interpretational mechanisms, which have been argued for independently of
than-clauses (in these two examples: exhaustification and an alternative
semantics for exactly-numerals). My strategy is motivated by the lack of clear
scope interaction in than-clauses.
One feature of the proposal is that the semantics of the comparative
operator is very simple. It is the same semantics that one needs for data like
(172a), namely one in which the first argument of the comparative operator is
a degree, (172c). Maximality is still used in clausal comparatives like the ones
we have discussed, but it is independent of the comparative operator.
(172) a. John is taller than 1.70 m.

b. [[-er [than 1.70 m]] [2 [John is t2 tall]]]
c. -er = λd1 . λd2 . d2 > d1
It is in this sense that the analysis developed here is in my opinion ‘simpler’
than Schwarzschild & Wilkinson’s. The complexity that is no doubt there
in the present analysis consists in the assumption that general interpretive
strategies like informativity and maximality are involved (plus in indepen-
dent complications like the availability of specific readings for indefinites
and the like). Also, the semantics is no longer completely determined by
compositional semantics. Data with differentials could only be analysed by
enriching the classical semantics with pragmatic notions (covers, contextual
background). However, this aspect of the proposal is supported by contextual
variability of the judgements and thus has to be part of a successful analysis.
In order to ultimately evaluate the success of my proposals, the whole
approach needs to also be extended to adverbials. I will not attempt to do so
now. Other considerations concern a more detailed analysis of the various
modals (including might) and an investigation of the interaction of several
scope bearing elements inside a than-clause. I give some representative data
below and acknowledge the need for further work on the subject (compare
Schwarzschild & Wilkinson 2002, Heim 2006b, Schwarzschild 2008). Finally, I
admit that I have no analysis for Sauerland’s (2008) example (174), for which
he provides a solution in terms of Heim’s theory.
(173) a. It is hotter here today than it often is in New Brunswick.

b. It is hotter today than it might be tomorrow.
c. Sveta solved this problem faster than someone else could have.
(174) Ekaterina is an odd number of centimeters taller than each of her
teammates.
These issues are left for future work.
4.2 Where do the intervals come from?
There is one important theoretical question left for the intervals-plus-selection

analysis to answer: where do the intervals come from? In Section 3 I made
the assumption that basic adjective meanings already contained intervals:
(175) ⟦tall⟧ = [λD. λx. Height(x) ∈ D]
I could alternatively have assumed that the operator Pi from Heim 2006b
shifts the standard adjective meaning to (175).
(176) Pi shifts from degrees to intervals: [1 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]
(177) a. ⟦tall⟧ = [λd. λx. Height(x) ≥ d]
b. ⟦Pi⟧ = [λD. λP. max(P) ∈ D]
c. [λD. ⟦Pi⟧(D)(λd. ⟦tall⟧(d)(x))] = [λD. Height(x) ∈ D]
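The equivalence in (177c) can be confirmed on a discrete scale. This is a minimal sketch with assumptions of my own: degrees are whole centimetres from 0 to 250, an interval is a Python `range`, and the degree-based adjective is taken in its monotone ‘at least’ form, Height(x) ≥ d, so that the set of degrees has a maximum.

```python
SCALE = range(0, 251)  # hypothetical discrete degree scale, in cm

def tall(d, height):
    """Degree-based adjective meaning: x is tall to degree d iff Height(x) >= d."""
    return height >= d

def Pi(D, P):
    """Pi as in (177b): the maximum of the degree set P lies in the interval D."""
    return max(P) in D

def tall_interval(D, height):
    """(177c): apply Pi to the set of degrees to which x is tall."""
    degrees = {d for d in SCALE if tall(d, height)}
    return Pi(D, degrees)

D = range(170, 181)  # the interval from 170 cm to 180 cm
```

For any height on the scale, `tall_interval(D, height)` gives the same verdict as checking directly whether the height lies in D, which is the interval-based meaning in (175).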
Since Pi on the analysis pursued here always takes scope immediately next to
the adjective, this would have served no particular purpose and I simplified
to (175). But a problem for assuming (175) as the basic meaning of a gradable
adjective is that it is very weak. This creates problems for example for the
negation theory of antonymy (compare e.g. Heim 2006a). (178a) analyses
the negative polar adjective short as the negation of tall. I fail to be able to
imagine how a parallel strategy for the interval based meaning (178b) could
be successful.
(178) a. ⟦short⟧ = [λd. λx. ¬Height(x) ≥ d] = [λd. λx. Height(x) < d]
b. ⟦short⟧ = λD. λx. Height(x) ∉ D
So if the intervals do not come into the semantics via a motivated independent
(since mobile) operator Pi, and nor are they plausibly basic, how do they come
in? It would be attractive to say that intervals enter the semantics because,
that is, if and only if, they are needed. That is what I would like to think, and
(175) really was a simplification for the sake of uniformity that I think of as
preliminary.
An idea for how to bring intervals into the semantics when needed that is
due to Heim (2009) is given below. We begin by observing that a relation can
be expressed between a plurality and a part of a scale — a degree ‘blob’.
(179) a. (You have to be 5′ tall to enter.)

Our children are that tall.
b. (Bill’s GPA is 3.75.)
Sam’s grades are that good, too.
We see a parallel to expressing a relation between a plurality and a mass noun.

The example (180a) can be represented as in (180b) with the meaning in (180c)
in mind for the relation between the two objects of drink — a cumulative
interpretation (see e.g. Beck & Sauerland 2000 and all the earlier work cited
there that they rely on).
(180) a. Our children drank the milk.

b. ∗ ∗ drank(M)(C)
c. ∀x ≤ C : ∃y ≤ M : drank(y)(x)&∀y ≤ M : ∃x ≤ C : drank(y)(x)
All children participated in drinking the milk, and all parts of

the milk were drunk by one of the children.
Transferring the analysis to our degree example yields (181).
(181) a. Our children are that tall.

b. ∗ ∗ tall(D)(C)
c. ∀x ≤ C : ∃d ≤ D : tall(d)(x) & ∀d ≤ D : ∃x ≤ C : tall(d)(x)
All the children’s heights fall into D, and all parts of D contain
the height of a child.
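The truth conditions in (181c) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's formalisation: the heights are hypothetical, the parts of the plurality C are taken to be the individual children, and the parts of the degree 'blob' D are taken to be half-open 10 cm cells, with tall(d)(x) read as "the height of x falls into cell d".

```python
# Hypothetical heights in cm; cells are half-open intervals (lo, hi).
HEIGHT = {"ann": 124, "ben": 131, "cleo": 128}

def tall_in(cell):
    """tall(d)(x) for a cell d: Height(x) falls into d."""
    lo, hi = cell
    return lambda x: lo <= HEIGHT[x] < hi

def cumulate(R, D, C):
    """**R(D)(C): every part of C is R-related to some part of D,
    and every part of D is R-related to some part of C."""
    return (all(any(R(d)(x) for d in D) for x in C)
            and all(any(R(d)(x) for x in C) for d in D))

C = {"ann", "ben", "cleo"}
D_exact = {(120, 130), (130, 140)}

print(cumulate(tall_in, D_exact, C))                   # True
print(cumulate(tall_in, D_exact | {(140, 150)}, C))    # False: one cell holds no child
print(cumulate(tall_in, {(120, 130)}, C))              # False: ben is not covered
```

The second conjunct is what excludes the larger 'blob': every part of D has to contain the height of some child.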
It is easy to apply the same analysis to a than-clause containing a definite

plural, and it yields the set of intervals that we need according to the analysis
in Section 3. Comparison will be with the maximum point in that set and the
sentence is predicted to mean that our children are shorter than John.
(182) a. (John is taller) than our children are.

b. λD. ∗ ∗ tall(D)(C)
c. λD. ∀x ≤ C : ∃d ≤ D : tall(d)(x)&∀d ≤ D : ∃x ≤ C : tall(d)(x)
intervals that contain the heights of all our children (and nothing
else)
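A small sketch of how (182) feeds comparison, under simplifying assumptions that are mine and not the paper's: the degree scale is a toy scale of ten abstract units, the parts of D are single degrees, tall(d)(x) is read as Height(x) = d, and selection is plain maximality over the points of the intervals in the than-clause denotation.

```python
from itertools import combinations

SCALE = range(10)                                   # toy scale, abstract units
HEIGHT = {"ann": 3, "ben": 5, "cleo": 4, "john": 7}  # hypothetical heights
CHILDREN = {"ann", "ben", "cleo"}

def tall_at(d):
    """Point reading assumed here: tall(d)(x) iff Height(x) == d."""
    return lambda x: HEIGHT[x] == d

def cumulate(R, D, C):
    """**R(D)(C), as in (182c)."""
    return (all(any(R(d)(x) for d in D) for x in C)
            and all(any(R(d)(x) for x in C) for d in D))

def than_clause(C):
    """λD. **tall(D)(C): all non-empty D over SCALE that satisfy (182c)."""
    return [frozenset(D)
            for size in range(1, len(SCALE) + 1)
            for D in combinations(SCALE, size)
            if cumulate(tall_at, set(D), C)]

def max_point(intervals):
    """Selection by maximality: the highest point in any of the intervals."""
    return max(max(D) for D in intervals)

def taller_than(x, C):
    return HEIGHT[x] > max_point(than_clause(C))

print(than_clause(CHILDREN))          # [frozenset({3, 4, 5})]
print(taller_than("john", CHILDREN))  # True
```

On this toy model the cumulative than-clause denotes a single 'blob', and John counts as taller only if he exceeds its highest point, that is, the height of the tallest child.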
Note that the notion of degree ‘blobs’ that have a part/whole structure is
anticipated by the reference to covers in Section 3. A cover provides us
with the relevant parts of the degree scale. We are consistently assuming a
mass-like structure of the degree scale. To make the connection clear, (182′)
provides a more complete formalisation of (182a) which includes covers
(compare Beck 2001 for this kind of use for covers).
(182′) a. λD. [∗∗ λd. λx. d ∈ Cov & x ∈ Cov & tall(d)(x)](D)(C)

b. λD. ∀x[x ≤ C & x ∈ Cov → ∃d[d ≤ D & d ∈ Cov & tall(d)(x)]] &
∀d[d ≤ D & d ∈ Cov → ∃x[x ≤ C & x ∈ Cov & tall(d)(x)]]
(suppose that the relevant parts of ‘the children’ are the indi-
vidual children, and that the relevant parts of the cover are the
units according to granularity)
Example (182)/(182′) derives a set of intervals, as pluralities of degrees, as
the meaning of a than-clause via plural predication. What would we need
to do in order for this idea to apply to the range of data examined in this
paper? I briefly discuss three issues for which this change in perspective is
relevant: (i) universal quantifiers, (ii) singular quantifiers, and (iii) maximal
informativity.
First, regarding universal quantifiers: The introduction of intervals analo-
gously to (182) would have to happen with universal quantifiers of various
kinds, in particular universal nominals and intensional verbs (cf. our two
representative examples every girl and predict). Regarding intensional verbs,
there is a proposal by Bošković & Gajewski (2008) that instead of universal
quantification over worlds (183a) they (or at least some of them) involve sum
formation (183b).
(183) a. believex = λp. ∀w[w ∈ BELx → p(w)]

b. believex = max(λW . W ∈ ∗BELx )
This makes possible the following analysis of a than-clause with an inten-

sional verb (in the simpler version without covers):
(184) a. (John is taller) than you believe.

b. λD.[∗∗λw.λd. John is d-tall in w](max(λW .W ∈ ∗BELyou ))(D)
c. λD.∀w ≤ max(λW .W ∈ ∗BELyou ) : ∃d ≤ D : tall(w)(d)(John)&
∀d ≤ D : ∃w ≤ max(λW .W ∈ ∗BELyou ) : tall(w)(d)(John)
intervals that contain John’s height in all your belief worlds (and
nothing else)
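The sum-formation idea in (183b) and its use in (184) can be sketched on a toy model. Everything concrete here is hypothetical: the three belief worlds, John's height in each of them, his actual height, and the point reading of "John is d-tall in w" as "John's height in w equals d".

```python
from itertools import combinations

BEL_YOU = {"w1", "w2", "w3"}                      # atomic belief worlds
JOHN_HEIGHT = {"w1": 178, "w2": 180, "w3": 183}   # John's height per world
ACTUAL_HEIGHT = 185

def star(atoms):
    """*BEL: closure of the atoms under sum, modelled as non-empty subsets."""
    atoms = sorted(atoms)
    return {frozenset(c)
            for n in range(1, len(atoms) + 1)
            for c in combinations(atoms, n)}

def maximal(sums):
    """max(λW. W ∈ *BEL): the unique sum that contains all the others."""
    (top,) = [W for W in sums if all(V <= W for V in sums)]
    return top

def cumulate(R, W, D):
    """**R(W)(D): every world has some degree in D, and vice versa."""
    return (all(any(R(w)(d) for d in D) for w in W)
            and all(any(R(w)(d) for w in W) for d in D))

def john_tall_in(w):
    return lambda d: JOHN_HEIGHT[w] == d

believe_you = maximal(star(BEL_YOU))
D = {JOHN_HEIGHT[w] for w in believe_you}

print(cumulate(john_tall_in, believe_you, D))   # True
print(ACTUAL_HEIGHT > max(D))                   # True: taller than you believe
```

The plurality of belief worlds plays the role that the plurality of children plays in (182); no universal quantification over worlds is involved.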
Nominal universal quantifiers, it has been observed, can sometimes be used to

introduce a plurality, although this is not always easily possible. Perhaps (185)
involves a reinterpretation as a plural definite NP. The same reinterpretation
would be responsible for the interpretation of the than-clause in (186) in
case the girls are of varying heights. This might make sense of my above-
mentioned intuition that a definite plural is more acceptable than a universal
NP.
(185) a. Everyone gathered in the hallway.

b. ?Every student gathered in the hallway.
(186) a. (John is taller) than every girl is.
b. ‘every girl’ → G (the plurality of girls)
c. λD.∗∗ tall(D)(G)
d. λD. [∀x ≤ G : ∃d ≤ D : tall(d)(x)]
& [∀d ≤ D : ∃x ≤ G : tall(d)(x)]
intervals that contain the heights of all the girls (and nothing
else)
Thus it can be argued that a plural analysis of intervals can capture these
data.15 The discussion from Section 3 is almost unchanged (see below); what
changes is what happens below the level of AP, so to speak (the predication ‘x
is d-tall’): what we assumed to be basic in (175) is now compositionally derived
via pluralization mechanisms. Next, let’s reconsider data with singular
quantificational elements:
(187) a. Mary is taller than anyone else is.

b. *John is taller than no girl is.
(187′) a. John is taller than some girls are.
b. λD. ∃X : [∀x ≤ X : ∃d ≤ D : tall(d)(x)]
& [∀d ≤ D : ∃x ≤ X : tall(d)(x)]
c. λD. [∀x ≤ f(∗girl) : ∃d ≤ D : tall(d)(x)]
& [∀d ≤ D : ∃x ≤ f(∗girl) : tall(d)(x)]
There would be no reason to introduce intervals in the data with singular

indefinites and negative quantifiers. Remember from Section 3.1 that in these
cases, we got rid of the intervals immediately anyway (maximal informativity
reduced the contribution of the than-clause to the set of individual heights).
Now, we could just revert to the classical analysis for those data. This
is not an unwelcome result, since the classical analysis offers a successful
solution for them. Pluralization as the trigger for the introduction of intervals
will continue to play a role for plural indefinites (see example (187′)); the
discussion in Section 3.3 is thus also in important respects unchanged.
Finally, we need to think once more about the role of maximal infor-
mativity. Plural semantics keeps intervals small. The truth conditions of
cumulation are such that the pluralised relation holds between the plurality
and the smallest interval that covers all the individuals in the plurality (cf.
the second conjunct in (181c) and the following analyses). This may make
m_inf unnecessary, leaving us with iterated maximality. Again this can be
seen as a welcome result.
15 I am not sure at this point what to say about the have to–type modals. Perhaps (as
non-neg-raising verbs) they do not have a plural analysis. We then revert to the classical
analysis. If they do have a plural semantics, the story in Section 3.1 is maintained. The first
version relates the behavior of a modal to neg-raising, the second to SMC use.
The attraction of this approach is, as said above, that intervals enter
the picture only when there is a real need for them. The idea is entirely
compatible with the selection analysis and in my view very desirable. Why
did I not set out in this fashion in Section 3? I am not quite confident enough
of the story in (185), (186), and too many details remain to be worked out,
plus the data need to be examined more carefully. As things stand, readers
sceptical of the ideas sketched in this subsection may take Section 3 as it is,
while others have the beginnings of an analysis of how and why intervals
come into play at all.
4.3 Outlook
Let’s take a step back and think about what an analysis of quantifiers in
than-clauses in terms of selection achieves — beyond the empirical coverage
of the mostly well-known set of data that I have been concerned with above.
Compared to its theoretical competitors, it primarily removes quantifiers
in than-clauses from the realm of scope interaction phenomena. For example,
the interpretive behaviour of quantifiers in than-clauses cannot be seen as
an instance of the Heim/Kennedy generalization (Kennedy 1997; Heim 2001).
The analysis I’ve given in Section 3 violates this generalization.
(188) Heim/Kennedy generalization: *[DegP ... [QP [... tDegP ...] ...]]

(189) a. than [1 [every girl is t1 tall]]
b. λD. for every girl x : Height(x) ∈ D
The Heim/Kennedy generalization is motivated in particular by quanti-

fiers in the matrix clause of comparatives. Suppose that the behaviour of
quantifiers in the matrix clause relative to degree operators is regulated by
a scope constraint deriving the Heim/Kennedy generalization. Then there
would be no theoretical connection between this and than-clause quantifiers.
We would accordingly expect empirical differences between quantifiers in
main clause vs. than-clause. On the other hand, if one were to extend the
requirement of finding a definite degree from the than-clause to the main
clause (a good way of ensuring applicability of the lexical entry in (172c),
note), a parallel analysis could still be pursued. (See once more Heim 2009
for a sketch of such an analysis.) There are some striking similarities be-
tween main clause and than-clause quantifiers that motivate such a step, in
particular (190), (191) below: Both sentences in (190) have an interpretation

that talks about the minimum requirement length of the paper, and neither
sentence in (191) does.
(190) a. The paper is longer than it is required to be.

b. The paper is required to be less long than that.
(191) a. The paper is longer than it is supposed to be.
b. The paper is supposed to be less long than that.
But there are also apparent mismatches:
(192) a. Hier ist es schöner als anderswo.

here is it nicer than elsewhere
‘It is nicer here than it is elsewhere.’
b. ok: It is nicer here than it is in the most beautiful other place.
(193) a. Anderswo ist es weniger schön als hier.
elsewhere is it less nice than here
‘It is less nice elsewhere than it is here.’
b. ??The most beautiful other place is less nice than it is here.
(194) a. Sam war schneller als jemand anderes.
Sam was faster than someone other
‘Sam was faster than another person.’
b. ok: Sam was faster than the fastest other person.
(195) a. Jemand anderes war weniger schnell als Sam.
Someone other was less fast than Sam
‘Another person was less fast than Sam.’
b. ??The fastest other person was less fast than Sam.
At this point, I do acknowledge interesting empirical parallels, but I am also

worried about apparent differences. I would not wish to be committed at
present to claiming that quantifiers in the main clause behave in the same
way as quantifiers in the than-clause, or that they don’t, and will remain
neutral as to whether the analysis developed here should be extended to
cover matrix clause quantifiers as well.
Instead of making a connection to scope interaction phenomena, the
present analysis is based on a plural/mass-semantics related vagueness plus
semantic and pragmatic glue. It makes the interpretation of quantifiers in
than-clauses more of a coercion-like phenomenon. Perhaps the variable and
partly messy nature of the data can motivate the nature of the analysis.
References
Beck, Sigrid. 2001. Reciprocals are definites. Natural Language Semantics

9(1). 69–138. doi:10.1023/A:1012203407127.
Beck, Sigrid. 2009. Comparatives and superlatives. To appear in Klaus
von Heusinger, Claudia Maienborn, and Paul Portner (eds.), Handbook
of semantics: An international handbook of natural language meaning.
Berlin: Mouton de Gruyter.
Beck, Sigrid & Hotze Rullmann. 1999. A flexible approach to ex-
haustivity in questions. Natural Language Semantics 7(3). 249–298.
doi:10.1023/A:1008373224343.
Beck, Sigrid & Uli Sauerland. 2000. Cumulation is needed: A re-
ply to Winter 2000. Natural Language Semantics 8(4). 349–371.
doi:10.1023/A:1011240827230.
Bošković, Željko & Jon Gajewski. 2008. Semantic correlates of the NP/DP
parameter. Proceedings of the North East Linguistics Society 39. URL
http://gajewski.uconn.edu/papers/NELS39paper.pdf.
Cresswell, Max J. 1977. The semantics of degree. In Barbara H. Partee (ed.),
Montague grammar, 261–292. Academic Press.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stan-
ley Peters. 1998. Reciprocal expression and the concept of reciprocity.
Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Endriss, Cornelia. 2009. Quantificational topics: A scopal treatment of ex-
ceptional wide scope phenomena (Studies in Linguistics and Philosophy
(SLAP) 86). Springer. doi:10.1007/978-90-481-2303-2.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to
Harlem: Anankastic conditionals and related matters. URL http://mit.
edu/fintel/fintel-iatridou-2005-harlem.pdf. Ms, MIT.
Fox, Danny. 2007. Free choice and the theory of scalar implicatures. In
Uli Sauerland & Penka Stateva (eds.), Presupposition and implicature in
compositional semantics, 71–120. New York: Palgrave Macmillan.
Fox, Danny & Martin Hackl. 2006. The universal density of measurement.
Linguistics and Philosophy 29(5). 537–586. doi:10.1007/s10988-006-9004-4.
Gajewski, Jon. 2008. More on quantifiers in comparative clauses. Proceedings
of Semantics and Linguistic Theory 18. doi:1813/13043.
Hackl, Martin. 2001a. Comparative quantifiers. Ph.D. thesis, Massachusetts
Institute of Technology. URL http://hdl.handle.net/1721.1/8765.
Hackl, Martin. 2001b. A comparative syntax for comparative quantifiers.
Proceedings of the North East Linguistics Society 31.

Hackl, Martin. 2009. On the grammar and processing of proportional quan-
tifiers: most versus more than half. Natural Language Semantics 17(1).
63–98. doi:10.1007/s11050-008-9039-x.
Heim, Irene. 1982. The semantics of definite and indefinite noun phrases.
Ph.D. thesis, University of Massachusetts at Amherst. URL http://
semanticsarchive.net/Archive/Tk0ZmYyY.
Heim, Irene. 1994. Interrogative semantics and Karttunen’s semantics for
know. In Rhonna Buchalla & Anita Mittwoch (eds.), The proceedings of
the conference of the Israel Association for Theoretical Linguistics (IATL 1),
128–144. Hebrew University of Jerusalem. URL http://semanticsarchive.
net/Archive/jUzYjk1O.
Heim, Irene. 2001. Degree operators and scope. In Caroline Féry & Wolfgang
Sternefeld (eds.), Audiatur vox sapientiae: A festschrift for Arnim von
Stechow, 214–239. Berlin: Akademie Verlag.
Heim, Irene. 2006a. Little. Proceedings of Semantics and Linguistic Theory 16.
doi:1813/7579.
Heim, Irene. 2006b. Remarks on comparative clauses as generalized quanti-
fiers. URL http://semanticsarchive.net/Archive/mJiMDBlN. Ms, MIT.
Heim, Irene. 2009. A unified account? Handout for ‘Topics in Semantics’,
MIT.
Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar.
Oxford: Blackwell.
Hellan, Lars. 1981. Towards an integrated analysis of comparatives (Ergebnisse
und Methoden moderner Sprachwissenschaft 11). Tübingen: Narr.
Hoeksema, Jack. 1983. Negative polarity and the comparative. Natural
Language and Linguistic Theory 1(3). 403–434. doi:10.1007/BF00142472.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives.
In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara H. Partee (eds.),
Quantification in natural languages (Studies in Linguistics and Philosophy
(SLAP) 54), 451–486. Dordrecht: Kluwer.
Kennedy, Chris. 1997. Projecting the adjective: The syntax and semantics of
gradability and comparison. Ph.D. thesis, University of California, Santa
Cruz.
Klein, Ewan. 1991. Comparatives. In von Stechow & Wunderlich (1991),
chap. 32, 673–691.
Krasikova, Sveta. 2008. Quantifiers in comparatives. Proceedings of Sinn
und Bedeutung 12. 337–352. URL http://www.hf.uio.no/ilos/forskning/
konferanser/SuB12/proceedings/krasikova_337-352.pdf.
Krasikova, Sveta & Ventsislav Zhechev. 2006. You only need a scalar only. Pro-
ceedings of Sinn und Bedeutung 10. URL http://www.sfb441.uni-tuebingen.
de/b10/Pubs/KrasikovaZhechev_SuB05.pdf.
Kratzer, Angelika. 1991. Modality. In von Stechow & Wunderlich (1991),
639–650.
Kratzer, Angelika. 1998. Scope or pseudoscope? Are there wide-scope
indefinites? In Susan Rothstein (ed.), Events and grammar. Dordrecht:
Kluwer.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken
Turner (ed.), The semantics/pragmatics interface from different points of
view (Current Research in the Semantics/Pragmatics Interface 1), 257–291.
Elsevier.
Krifka, Manfred. 2007. Approximate interpretation of number words: A case
for strategic communication. In Gerlof Bouma, Irene Maria Krämer &
Joost Zwarts (eds.), Cognitive foundations of interpretation (Verhandelin-
gen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd.
Letterkunde 190), 111–126. Amsterdam: Royal Netherlands Academy of
Arts and Sciences.
Larson, Richard K. 1988. Scope and comparatives. Linguistics and Philosophy
11(1). 1–26. doi:10.1007/BF00635755.
Link, Godehard. 1983. The logical analysis of plurals and mass terms: A
lattice-theoretical approach. In Rainer Bäuerle, Christoph Schwarze &
Arnim von Stechow (eds.), Meaning, use, and interpretation of language,
Grundlagen der Kommunikation und Kognition, 302–323. de Gruyter.
May, Robert. 1985. Logical form: Its structure and derivation (Linguistic
Inquiry Monographs 12). Cambridge, MA: MIT Press.
Meier, Cécile. 2002. Maximality and minimality in comparatives. Sinn und
Bedeutung 6. 275–287. URL http://www.phil-fak.uni-duesseldorf.de/asw/
gfs/common/procSuB6/pdf/articles/MeierSuB6.pdf.
Partee, Barbara H. 1984. Compositionality. In Fred Landman & Frank Veltman
(eds.), Varieties of formal semantics (Groningen-Amsterdam Studies in
Semantics (GRASS) 3), 281–311. Dordrecht: Foris.
Reinhart, Tanya. 1992. Wh-in-situ: An apparent paradox. Proceedings of the
Amsterdam Colloquium 8. 483–492.
van Rooij, Robert. 2008. Comparatives and quantifiers. Empirical Issues in
Syntax and Semantics 7. 423–444. URL http://www.cssp.cnrs.fr/eiss7/
van-rooij-eiss7.pdf.
Rullmann, Hotze. 1995. Maximality in the semantics of wh-constructions. Ph.D.

thesis, University of Massachusetts at Amherst. URL http://scholarworks.
umass.edu/dissertations/AAI9524743/.
Sauerland, Uli. 2008. Intervals have holes: A note on comparatives with
differentials. Ms, ZAS Berlin.
Schwarzschild, Roger. 1996. Pluralities (Studies in Linguistics and Philosophy
(SLAP) 61). Kluwer.
Schwarzschild, Roger. 2004. Scope splitting in the comparative. URL http:
//www.rci.rutgers.edu/~tapuz/MIT04.pdf. Handout from a colloquium
talk at MIT.
Schwarzschild, Roger. 2008. The semantics of comparatives and other de-
gree constructions. Language and Linguistics Compass 2(2). 308–331.
doi:10.1111/j.1749-818X.2007.00049.x.
Schwarzschild, Roger & Karina Wilkinson. 2002. Quantifiers in comparatives:
A semantics of degree based on intervals. Natural Language Semantics
10(1). 1–41. doi:10.1023/A:1015545424775.
Seuren, Pieter A.M. 1978. The structure and selection of positive and negative
gradable adjectives. In Donka Farkas, Wesley M. Jacobsen & Karol W.
Todrys (eds.), Papers from the Parasession on the Lexicon, Chicago Lin-
guistic Society, April 14–15, 1978 (CLS 14), 336–346.
von Stechow, Arnim. 1984. Comparing semantic theories of comparison.
Journal of Semantics 3(1-2). 1–77. doi:10.1093/jos/3.1-2.1.
von Stechow, Arnim. 1995. Lexical decomposition in syntax. In Urs Egli,
Peter E. Pause, Christoph Schwarze, Arnim von Stechow & Götz Wienold
(eds.), Lexical knowledge in the organization of language (Current Issues
in Linguistic Theory 114), 81–118. John Benjamins.
von Stechow, Arnim & Dieter Wunderlich (eds.). 1991. Semantics: An interna-
tional handbook of contemporary research. Berlin: de Gruyter.
Prof. Dr. Sigrid Beck

Chair of Descriptive and Theoretical Linguistics
Englisches Seminar
Universität Tübingen
Wilhelmstr. 50
72074 Tübingen
Germany
sigrid.beck@uni-tuebingen.de
doi: 10.3765/sp.3.3
Two kinds of modified numerals∗

Rick Nouwen
Utrecht University

Decision 2009-09-08 / Revised 2009-09-29 / Accepted 2009-10-14 / Final Version
Received 2009-10-15 / Published 2010-01-26
Abstract In this article, I show that there are two kinds of numeral modifiers:
(Class A) those that express the comparison of a certain cardinality with the
value expressed by the numeral and (Class B) those that express a bound
on a degree property. The goal is, first of all, to provide empirical evidence
for this claim and second to account for these data within a framework that
treats modified numerals as degree quantifiers.
Keywords: modified numerals, scalar quantification, modality
1 Introduction
Modified numerals are most commonly exemplified by combinations of a

numeral and a comparative, as in more than 100. Following Hackl (2001),
I will refer to such expressions as comparative quantifiers. As (1) shows,
however, apart from modification by a comparative, numerals combine with
a striking diversity of expressions.
(1) more/fewer/less than 100                comparative quantifiers
    no more than 100, many more than 100    differential quantifiers
    at least/most 100                       superlative quantifiers
    100 or more/fewer/less                  disjunctive quantifiers
    under/over 100, between 100 and 200     locative quantifiers
    from/up to 100, from 100 to 200         directional quantifiers
    minimally/maximally 100, 100 tops       other
∗ I would like to thank two anonymous reviewers for their helpful comments. Many thanks,
moreover, to S&P editors Kai von Fintel and, especially, David Beaver, for their painstaking
efforts to point out ways in which to improve the article. A concise presentation of the main
points of this article appeared under the same title in the proceedings of the thirteenth
Sinn und Bedeutung conference (Nouwen 2009). Earlier ideas on this subject were presented
at Semantics and Linguistic Theory 18 in Amherst (2008) and the Journées Sémantique et
Modélisation in Toulouse (2008). I am grateful to the audiences of these events for useful
discussion. Special thanks to Min Que and Luisa Meroni for some help with data. This work
was supported by a grant from the Netherlands Organisation for Scientific Research NWO,
which I hereby gratefully acknowledge.
©2010 R.W.F. Nouwen
For a long time, there seemed to be agreement in the formal semantic lit-
erature that there was little to be gained from a thorough investigation of
these expressions. An especially dominant view, originating from generalised
quantifier theory (Barwise & Cooper 1981), was that there was not much more
to the semantics of such quantifiers than the expression of the numerical
relations >, <, ≤ and ≥. In the past decade, however, several studies have
shown that this is an overly simplistic assumption. Examples are Hackl 2001,
Krifka 1999 and Takahashi 2006 on comparative quantifiers, Nouwen 2008b
on negative comparative quantifiers, Solt 2007 on differential quantifiers,
Geurts & Nouwen 2007, Umbach 2006, Corblin 2007, Büring 2008 and Krifka
2007b on superlative quantifiers, Corver & Zwarts 2006 on locative quan-
tifiers and Nouwen 2008a on directional quantifiers.1 Such investigations
usually concern the specific quirks of a certain type of modified numeral.
While I believe that it is important to have a semantic analysis of modified
numerals on a case by case basis, I also believe that what is lacking from the
literature so far is a view of to what extent the various modified numerals in
(1) involve the same semantic structures. In this paper, I will attempt to reach
a generalisation along this line by claiming that there are two kinds of modi-
fied numerals: (A) those that relate the numeral to some specific cardinality
and (B) those that place a bound on the cardinality of some property. The
difference will be made clear below. The main examples of (A) are comparative
quantifiers like more/fewer than 100. Most other kinds of modified numerals
fall in the second class.
I will start by making clear what distinguishes the two classes of modified
numerals by presenting a body of data that sets them apart. Then, in section
3, I introduce a well-founded decompositional treatment of comparative
quantifiers, proposed by Hackl (2001), which I take to represent the proper
treatment of class A modifiers. In section 4, I propose that class B modifiers
are operators that indicate maxima/minima. I will then account for the
distribution of these quantifiers by arguing that they are often blocked by
unmodified numerals, which are capable of expressing equivalent meanings.
Section 5 discusses a particular problem that occurs with the interaction of
B-type quantifiers with modal operators. In section 6, I provide some more
details on the empirical basis for the A/B distinction. Section 7 concludes.
1 See also Nouwen 2010b for an overview.
2 Class A and class B modified numerals
It is a striking feature of comparative quantifiers that they can be used to

assert extremely weak propositions. For instance, (2) is acceptable, even
though it expresses a rather under-informative truth.
(2) A hexagon has fewer than 11 sides. A
This example contrasts strongly with the examples in (3), which are all
unacceptable. (Or, alternatively, one might have the intuition that they are
false).
(3) a. #A hexagon has at most 10 sides. B

b. #A hexagon has maximally 10 sides. B
c. #A hexagon has up to 10 sides. B
Why is this so? A naive theory might have it that (2) states that the number
of sides in a hexagon is strictly smaller than 11 (i.e. < 11), and that the only
difference with (3) is that, there, it is stated that this number is smaller than or
equal to 10 (i.e. ≤ 10). Clearly, 6 is both < 11 and ≤ 10. So why are not both
kinds of examples under-informative but true? On the naive view, having at
most 10 sides is expected to be equivalent to having fewer than 11 sides. That
is, both these properties pick out objects with n ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
sides. Semantically, no contrast is to be expected. Given this semantic
equivalence, a pragmatic explanation of the contrast between (2) and (3)
seems equally unlikely.2
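The equivalence that the naive view predicts can be verified mechanically. The sketch below encodes only the naive denotations discussed above; the function names and the finite range of candidate side counts are illustrative assumptions.

```python
def fewer_than(n):
    """Naive denotation: the count is strictly smaller than n."""
    return lambda k: k < n

def at_most(n):
    """Naive denotation: the count is smaller than or equal to n."""
    return lambda k: k <= n

COUNTS = range(0, 100)   # hypothetical finite range of side counts

naive_2 = {k for k in COUNTS if fewer_than(11)(k)}
naive_3a = {k for k in COUNTS if at_most(10)(k)}

print(naive_2 == naive_3a)                   # True: both pick out 0..10
print(fewer_than(11)(6), at_most(10)(6))     # True True: a hexagon satisfies both
```

Since the two predicates have the same extension over whole numbers, the contrast between (2) and (3) cannot come from these truth conditions alone.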
Let us call quantifiers that are acceptable in such examples class A quan-
tifiers and those that are like (3) class B quantifiers. As the contrast between
(4) and (5) shows, the distinction is also visible with lower bound quantifiers.
That is, (4) is under-informative, yet true and acceptable, while the examples
in (5) are unacceptable/false.
(4) A hexagon has more than 3 sides. A
(5) #A hexagon has {at least / minimally} 3 sides. B
2 A reviewer wondered whether the naive view could not be maintained if we assume that
there is a pragmatic effect associated to the fact that ≤ n includes the possibility of n while
< n excludes it. It is very much unclear what kind of effect that would be, however. One
could, for instance, base a pragmatic inference on the fact that, in (3a), the speaker seems to
signal the possibility that a hexagon has 10 sides by using at most 10. However, one could
equally argue that the same signal is given by the speaker of (2), simply by using fewer than
11 instead of fewer than 10.
What I think is the underlying problem of examples involving class B ex-

pressions is that such quantifiers are incapable of expressing relations to
definite amounts. Class A expressions, on the other hand, excel at doing so.
Imagine, for instance, that we are talking about my new laptop and that we
are concerned with how much internal memory it has. Say that it has 1GB of
memory (and that I know that it has that much memory). I can then assert (6)
in a context where you, for instance, just told me that your laptop has 2GB of
memory.
(6) My laptop has less than 2GB of memory.
Or, if your computer has a mere 512MB of memory, I can boast that:
(7) My laptop has more than 512MB of memory.
In these examples, I am comparing the definite amount of 1GB, i.e. the precise
amount of memory I know my laptop has, to some given contrasting amount
2GB (512MB) by means of less than (more than). This is something class A
quantifiers can do very well, but something that is unavailable for class B
modified numerals:
(8) I know exactly how much memory my laptop has. . .

a. . . . and it is {#at most / #maximally / #up to} 2GB.
b. . . . and it is {#at least / #minimally} 512MB.
In contrast to (8), class B quantifiers are acceptable when what is ‘under

discussion’ is not a definite amount, but rather a range of amounts, as in (9).
(9) a. Computers of this kind have {at most / maximally / up to} 2GB of
memory.
b. Computers of this kind have {at least / minimally} 512MB of mem-
ory.
In other words, it appears that class B quantifiers relate to ranges of values,

rather than to a single specific cardinality.3 This intuition is supported by
(10).
(10) Jasper invited maximally 50 people to his party.
We normally interpret (10) to indicate that the speaker does not know how
many people Jasper invited. That is, it is unacceptable for a speaker to utter
(10) if s/he has a definite amount in mind, which is why the addition of 43, to
be precise in (11) is infelicitous.4
(11) Jasper invited maximally 50 people to his party. #43, to be precise.
By assuming that the speaker does not know the exact amount, (10) is
interpreted as being about the range of values possible from the speaker’s
perspective. The speaker thus states that there is a bound on that range.
The same intuition occurs if we substitute maximally 50 by any other class B
quantifier.
In sum, I showed that the landscape of modified numerals can be divided
into two separate classes of expressions. What distinguishes class B quanti-
fiers from other modified numerals is that they are incompatible with definite
amounts and are always interpreted with respect to a range of values. Below,
I will present a semantics of class B expressions that makes this intuition
precise. Before I can do so, however, I will need to discuss the semantics of
A-type numeral modifiers.
3 In his comments on this article, David Beaver pointed out examples like (i), where the number
appears to be a variable quantified over.
(i) There were maximally 50 people there at any one time.
Although I will not attempt a compositional analysis of cases like (i), such examples do
appear to support the main intuition that class B quantifiers express relations between
amounts and ranges. An example like (i) states that 50 is the maximum of the range formed
by the different number of people present at different times. This is different from (ii),
which states that at any time the number of people present did not exceed 50. (This is true,
for instance, in case from start to finish there were always 20 people present.) So while (i)
expresses a maximum on a range of values created by quantification, (ii) quantifies over
different times and compares the number of people present at that time with 50.
(ii) There were fewer than 50 people there at any one time.
4 Compare this to (i), which forms a minimal pair with (11).
(i) Jasper invited fewer than 50 people to his party. 43, to be precise.
3 Hackl’s semantics for comparative modifiers
In this section, I discuss the semantics for comparative modified numerals

as developed in Hackl 2001. I will assume that this represents the proper
treatment of class A numeral modifiers. I also extend the framework slightly
by adding a way to account for the ambiguity of non-modified numerals.
3.1 Class A modifiers as degree quantifiers
What is the semantics of a class A quantifier? It is tempting to think that

class A quantifiers correspond to the well-known generalised quantifier-style
determiner denotations such as the ones in (12).5
(12) ⟦more than 10⟧ = λP.λQ. ∃x[#x > 10 & P(x) & Q(x)]
     ⟦fewer than 10⟧ = λP.λQ. ¬∃x[#x ≥ 10 & P(x) & Q(x)]
In the past decade it has become clear that it is important to have a closer
look at these modified numerals (Krifka 1999; Hackl 2001). In what follows,
I will assume the following semantics for more than and fewer than, which is
based on the arguments in Hackl 2001.
(13) ⟦more than 10⟧ = λM. max_n(M(n)) > 10
     ⟦fewer than 10⟧ = λM. max_n(M(n)) < 10
The workings of this definition will become clear below, but one of the main
motivations for an analysis along this line can be pointed out immediately.
The semantics in (13) is simply that of a comparative construction, where
cardinalities are seen as a special kind of degree. That is, like the comparative,
it involves a degree predicate M and a maximality operator that applies to
5 In a set-theoretic approach (12) would correspond to the perhaps more familiar (i). I discuss
(12) rather than (i) since, in what follows, I will assume a framework that makes use of
sum individuals. It is easy to see that, within their own respective frameworks, (12) and (i)
ultimately yield the same truth-conditions.
(i) ⟦more than 10⟧ = λX.λY. |X ∩ Y| > 10
    ⟦fewer than 10⟧ = λX.λY. |X ∩ Y| < 10
this predicate (Heim 2000). In other words, (13) is completely parallel to other
comparatives, like (14). While in (13), M is a predicate like being a number n
such that Jasper invited n people to his party, in (14) M could, for instance,
be filled in with something like being a degree d such that Jasper is tall to
degree d.
(14) ⟦-er than d⟧ = λM. max_d′(M(d′)) > d
Hackl assumes that argument DPs containing a (modified) numeral always
contain a silent counting quantifier many:
(15) ⟦many⟧ = λn.λP.λQ. ∃x[#x = n & P(x) & Q(x)]

(16) 10 sushis [ DP [ 10 many ] sushis ]
In this framework, the numeral (of type d, of degrees) is an argument
of the silent quantifier many (of type ⟨d, ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩⟩, of generalised
quantifier-style determiners parameterised for degrees). By applying [ 10
many ] to the noun (phrase), the standard generalised quantifier denotation
of 10 sushis is derived: λQ.∃x[#x = 10 & sushi(x) & Q(x)]. The structure
of a DP containing a modified numeral does not differ essentially. Modified
numerals are also the argument of a counting quantifier, as illustrated in (17).
(17) fewer than 10 sushis [ DP [ [ fewer than 10 ] many ] sushis]
As was stated above, many is parameterised for cardinalities, which we take
to be degrees. Fewer than 10, however, denotes a degree quantifier, not a
degree constant. Thus, to avoid a type clash, the modified numeral in (17) has
to move, leaving a degree trace and creating a degree property.
(18) Jasper ate fewer than 10 sushis.

[ [fewer than 10] [ λn [ Jasper ate [ [ n many ] sushis ] ] ] ]
This leads to the following interpretation, which results in the desired simple
truth-conditions.
(19) [λM. max_n(M(n)) < 10](λn. ∃x[#x = n & sushi(x) & ate(j, x)])
     =β max_n(∃x[#x = n & sushi(x) & ate(j, x)]) < 10
This might seem like a rather elaborate way of deriving the truth-conditions
for such simple sentences. Using (12), we would have derived as truth-conditions
¬∃x[#x ≥ 10 & sushi(x) & ate(j, x)], which is equivalent to (19),
but which does not require resorting to (moving) degree quantifiers and
silent counting quantifiers. Importantly, however, Hackl’s theory makes some
crucial predictions which are not made by theories assuming a semantics as
in (12).
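The equivalence between the two derivations can be checked mechanically on a toy model. The following is a minimal Python sketch, not part of Hackl's proposal: sum individuals are modelled as non-empty frozensets of atoms, the cardinality function # as len, and the helper names (groups, fewer_than_gq, many, fewer_than_degree, compare) are hypothetical. It further assumes that a degree predicate true of no degree has maximum 0, a choice the text does not address.

```python
from itertools import combinations

def groups(atoms):
    """All sum individuals over the atoms, modelled as non-empty frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c)
            for size in range(1, len(atoms) + 1)
            for c in combinations(atoms, size)]

def fewer_than_gq(n, P, Q, domain):
    """(12): there is no group of n or more atoms that satisfies P and Q."""
    return not any(len(x) >= n and P(x) and Q(x) for x in groups(domain))

def many(n, P, Q, domain):
    """(15): there is a group of exactly n atoms that satisfies P and Q."""
    return any(len(x) == n and P(x) and Q(x) for x in groups(domain))

def fewer_than_degree(n, M, degrees):
    """(13): the maximal degree satisfying the degree predicate M is below n.

    Assumption: if no degree satisfies M, the maximum is taken to be 0.
    """
    return max((d for d in degrees if M(d)), default=0) < n

def compare(eaten_count, bound=10, total=12):
    """Evaluate 'Jasper ate fewer than <bound> sushis' in both ways."""
    domain = [f"s{i:02d}" for i in range(total)]
    eaten = set(domain[:eaten_count])
    sushi = lambda x: True        # every atom in the toy domain is a sushi
    ate = lambda x: x <= eaten    # distributive: Jasper ate each atom of x
    gq = fewer_than_gq(bound, sushi, ate, domain)
    M = lambda d: many(d, sushi, ate, domain)  # Jasper ate d-many sushis
    deg = fewer_than_degree(bound, M, range(1, total + 1))
    return gq, deg
```

In this model compare(7) returns (True, True) and compare(10) returns (False, False): the generalised quantifier route and the degree quantifier route agree in simple sentences, so any difference between them has to come from scope interactions.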
If, like degree operators, modified numeral operators can take scope,
we expect to find scope alternations that resemble those found with degree
operators (Heim 2000). As Hackl observed, this prediction is borne out. For
reasons explained in Heim 2000, structural ambiguity arising from degree
quantifiers and intensional operators like modals is only visible with non-
upward entailing quantifiers, which is why all the following examples are
with upper-bounded modified numerals.
The example in (20), for instance, is ambiguous, with (20a) and (20b) as its
two readings.
(20) (Bill has to read 6 books.) John is required to read fewer than 6 books.
a. ‘John shouldn’t read more than 5 books’

b. ‘The minimal number of books John should read is fewer than 6’
One of the readings of (20) states that there is an upper bound on what John
is allowed to read. The more natural interpretation, however, is a minimality
reading, which is about the minimal number of books John is required to
read. (That is, (20) would, for instance, be true if John meets the requirements
as soon as he reads 3 or more books.)
Following Heim (2000), Hackl analyses this ambiguity as resulting from
alternative scope orderings of the modal and the comparative quantifier. The
upper bound reading, (20a), corresponds to a logical form where the modal
takes wide scope. The minimality reading involves the maximality operator
intrinsic to the comparative construction taking wide scope over the modal
(Heim 2000).
(21) □[max_n(∃x[#x = n & book(x) & read(j, x)]) < 6]
     [ require [ [fewer than 6] [ λn [ John read n-many books ] ] ] ]
(22) max_n(□∃x[#x = n & book(x) & read(j, x)]) < 6
     [ [fewer than 6] [ λn [ require [ John read n-many books ] ] ] ]
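That the two scope orders come apart can be illustrated with a small model. The sketch below is illustrative only and rests on simplifying assumptions: the deontic background is a set of permitted worlds, each world is identified with the number of books John reads in it, the degree scale is finite, and an empty degree predicate has maximum 0. The names max_degree, read, modal_wide and comparative_wide are hypothetical.

```python
DEGREES = range(1, 51)  # assumption: a finite scale is enough for the toy model

def max_degree(M):
    """Maximal degree satisfying M (0 if there is none, by assumption)."""
    return max((n for n in DEGREES if M(n)), default=0)

def read(world):
    """'John read n-many books' in a world where he reads `world` books."""
    return lambda n: 1 <= n <= world

def modal_wide(permitted, bound):
    """(21): in every permitted world, the number of books read is below the bound."""
    return all(max_degree(read(w)) < bound for w in permitted)

def comparative_wide(permitted, bound):
    """(22): the largest n that John reads in every permitted world is below the bound."""
    return max_degree(lambda n: all(read(w)(n) for w in permitted)) < bound

# Scenario from the text: John meets the requirements once he reads 3 or more books.
at_least_three = range(3, 51)
# Scenario with a genuine upper bound: John may read at most 5 books.
at_most_five = range(0, 6)
```

With at_least_three, modal_wide(at_least_three, 6) is False while comparative_wide(at_least_three, 6) is True, which is the minimality reading (20b). With at_most_five, both functions return True.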
A similar structural ambiguity can be observed with existential modals. The

two readings of (23) are an upper bound interpretation as well as a reading
which is very weak, stating simply that values below the numeral are within
what is permitted, without stating anything about the permissions for higher
values. (That is, the reading intended in (23b) is, for instance, verified by a
situation where there are no restrictions whatsoever on what John is allowed
to read. Clearly, (23a) would be false in such a situation.)
(23) John is allowed to bring fewer than 10 friends.

a. ‘John shouldn’t bring more than 9 friends’
b. ‘It’s OK if John brings 9 or fewer friends (and it might also be OK
if he brings more)’
As before, these readings can be predicted to exist on the basis of the relative
scope of modal and comparative quantifiers.
(24) max_n(♦∃x[#x = n & friend(x) & bring(j, x)]) < 10
     [ [fewer than 10] [ λn [ allow [ John bring n-many friends ] ] ] ]
(25) ♦[max_n(∃x[#x = n & friend(x) & bring(j, x)]) < 10]
     [ allow [ [fewer than 10] [ λn [ John bring n-many friends ] ] ] ]
The reader may check that Hackl’s predicted readings in (24) and (25) are
indeed the attested ones.
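One way to carry out this check is the following Python sketch. It assumes, for illustration only, that each permitted world can be identified with the number of friends John brings in it, that the degree scale is finite, and that an empty degree predicate has maximum 0. The function names are hypothetical.

```python
DEGREES = range(1, 51)  # assumption: a finite scale is enough for the toy model

def max_degree(M):
    """Maximal degree satisfying M (0 if there is none, by assumption)."""
    return max((n for n in DEGREES if M(n)), default=0)

def bring(world):
    """'John brings n-many friends' in a world where he brings `world` friends."""
    return lambda n: 1 <= n <= world

def upper_bound_reading(permitted, bound):
    """Comparative scopes over the modal: the largest permitted number is below the bound."""
    return max_degree(lambda n: any(bring(w)(n) for w in permitted)) < bound

def weak_reading(permitted, bound):
    """Modal scopes over the comparative: in some permitted world, fewer than `bound` are brought."""
    return any(max_degree(bring(w)) < bound for w in permitted)

no_restrictions = range(0, 51)  # John may bring any number of friends
at_most_nine = range(0, 10)     # John may bring 9 friends at most
```

In the unrestricted scenario, upper_bound_reading(no_restrictions, 10) is False and weak_reading(no_restrictions, 10) is True, matching the observation that (23b) is verified and (23a) falsified when there are no restrictions at all.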
3.2 Class B modifiers are different
These analyses are strongly supportive of an approach which treats comparative
quantifiers as comparative constructions. The question now is whether
class B quantifiers should be given a similar treatment. In other words, will
the semantics in (26) do?
(26) ⟦up to / maximally / at most / etc. 10⟧ =? λM. max_n(M(n)) ≤ 10
Choosing a semantics that is parallel to that of fewer than is partly unintuitive

since the class B quantifiers are not comparative constructions. Yet, cases
like maximally 10 suggest that the crucial ingredient of the semantics is the
same, namely a maximality operator. The unsuitability of the analysis in (26)
becomes immediately apparent, however, if we investigate examples with
class B modified numerals embedded under an existential modal: these turn
out not to be ambiguous (cf. Geurts & Nouwen 2007). Class B modifiers like
maximally, up to and at most always yield an upper bound on what is allowed
and resist the weaker reading that was found with comparative modifiers, as
the contrast between (27) and (28) makes clear.
(27) John is allowed to bring fewer than 10 friends.

But more is fine too.
(28) John is allowed to bring {up to / at most / maximally} 10 friends.
#But more is fine too.
A further interesting property of the interaction of class B modified numeral

quantifiers and modals is that existential modals interfere with the inferences
about speaker knowledge that we found for simple sentences. Above, I
observed that (29) licenses the inference that the speaker does not know
how many friends Jasper invited. In contrast, (30) does not license any such
inference; it is compatible with the speaker knowing exactly what is and what
is not allowed.
(29) Jasper invited maximally 50 friends.

(30) Jasper is allowed to invite maximally 50 friends.
These observations add to the data separating class A from class B quantifiers.
Summarising, the distinctions are then as follows. First of all, class B
quantifiers, but not class A quantifiers, resist definite amounts, except when
embedded under an existential modal. Second, class B quantifiers, but not
class A quantifiers, resist weak readings when embedded under an existential
modal.
In the next section I will argue that the peculiarities of class B quantifiers
can be explained if we assume that they are quite simply maxima and minima
indicators. Basically, what I propose is that the semantics of maximally
(minimally) is simply the operator max_d (min_d). This might be perceived as
stating the obvious. What is not obvious, however, is how such a proposal
accounts for the difference between class A and class B quantifiers. I will
argue that the limited distribution of class B modifiers is due to the fact that
they give rise to readings that are in competition with readings available
for non-modified structures. I will show that, in many circumstances, the
application of a class B modifier to a numeral yields an interpretation which
is equivalent to one that was already available for the bare numeral. Before I
can explain the proposal in detail, I therefore need to include an account of
bare numerals in the framework.
3.3 The semantics of numerals
Above, I adopted the semantics of Hackl 2001 for comparative modified

numerals. An important part in that framework is played by the counting
quantifier many. I will re-name this operator many1 , for, in what follows, I
assume that for any numeral there are two counting quantifiers available.
These two options are to account for the two meanings of numerals that
may be observed: on the one hand the existential / weak / lower-bounded
meaning and, on the other hand, the doubly bound / strong meaning. An
example like (31), for instance, is ambiguous between (31a) and (31b).
(31) Jasper read 10 books.

a. the number of books read by Jasper ≥ 10
b. the number of books read by Jasper = 10
I assume that, like the meaning in (31a), the meaning in (31b) is semantic and
not the result of a scalar implicature that results from (31a). See e.g. Geurts
2006 for a detailed ambiguity account, and for some compelling arguments
in favour of it.6
In the current framework, that of Hackl 2001, the weak reading in (31a) is
due to a weak semantics for the counting quantifier: i.e. many1 . I propose
that the strong reading, (31b), is accounted for by an alternative quantifier
many2 (taking inspiration from Geurts 2006.)7
(32) ⟦many1⟧ = λn.λP.λQ. ∃x[#x = n & P(x) & Q(x)]
     ⟦many2⟧ = λn.λP.λQ. ∃!x[#x = n & P(x) & Q(x)]
Here, ∃!x[ϕ] abbreviates ∃x[ϕ & ∀x′[x′ ≠ x → ¬ϕ[x/x′]]].8 In other words,
∃!x stands for ‘exactly one . . . ’. When x ranges over groups of individuals,
∃!x[#x = n & P(x)] is verified by assigning to x the maximal group of
individuals with property P, where n is the cardinality of that group. This is
because any smaller group will not be the unique group with property P of its
cardinality. For instance, if our domain is {a, b, c, d}, all of which satisfy P,
then ∃!x[#x = 3 & P(x)] is false, since several groups have three atoms and
property P, among which a ⊕ b ⊕ c and a ⊕ c ⊕ d. However, ∃!x[#x = 4 & P(x)]
6 But see Breheny 2008 for a dissenting view.
7 Here is a mnemonic. The 1 in many1 represents the fact that this operator is unilaterally
bound, namely lower-bounded only. Many2 on the other hand is bilaterally bound.
8 Here, ϕ[x/x′] is the formula that is exactly like ϕ except that free occurrences of x have
been replaced by x′. Moreover, it is assumed that ϕ contains no free occurrences of x′.
is true, since apart from a ⊕ b ⊕ c ⊕ d there is no other group that has 4

atoms while satisfying P . Consequently, ∃!x[#x = n . . .] stands for ‘exactly
n. . . ’. For instance, the doubly bound reading of Jasper read 10 books is (33).
The truth-conditions of (33) are such that it is false if Jasper read fewer than
10 books (for then there would not be 10 books he read), but also false if
Jasper read more than 10 books (for then there would be many groups of 10
books he read).
(33) ∃!x[#x = 10 & book(x) & read(j, x)]
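The uniqueness reasoning can be replayed on a small model. The following Python sketch is an illustration under stated assumptions, not an implementation of the proposal: groups are modelled as frozensets of atoms, P is taken to be distributive (true of a group when all its atoms satisfy it), and the helper names are hypothetical.

```python
from itertools import combinations

def groups_of_size(atoms, n):
    """All sum individuals with exactly n atoms, modelled as frozensets."""
    return [frozenset(c) for c in combinations(sorted(atoms), n)]

def exists(n, atoms, P):
    """The weak condition: some group of n atoms satisfies P."""
    return any(P(x) for x in groups_of_size(atoms, n))

def exists_unique(n, atoms, P):
    """The strong condition: exactly one group of n atoms satisfies P."""
    return sum(1 for x in groups_of_size(atoms, n) if P(x)) == 1

# The example from the text: every atom of {a, b, c, d} satisfies P.
domain = {"a", "b", "c", "d"}
P = lambda x: x <= domain  # distributive P: true of any group of P-atoms

def read_exactly(n, books_read, library):
    """Strong reading as in (33): a unique group of n books was read."""
    return exists_unique(n, library, lambda x: x <= books_read)
```

Here exists_unique(3, domain, P) is False, because four different three-atom groups satisfy P, whereas exists_unique(4, domain, P) is True. In the same way, read_exactly(10, ...) holds only when the set of books read has exactly 10 members.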
Not only does the option of two counting quantifiers, many1 and many2 ,
suffice to account for the ambiguity of bare numerals, it is moreover harmless
with respect to the semantics of comparative quantifiers. A sentence like
Jasper read more than 10 books is not ambiguous. It is important to show
that the availability of two distinct counting quantifiers does not predict
ambiguities in such examples. It will be instructive to see in somewhat more
detail why this is indeed the case.
The structure in (34) is exemplary of a simple sentence with a modified
numeral object. As explained earlier, the modified numeral applies to the
degree predicate that is created by moving the quantifier out of the DP.
(34) [ MOD n [ λd [ Jasper read d many1/2 books ] ] ]
Now that there is a choice between two counting quantifiers, the denotation
of the degree predicate depends on which of many1 and many2 is chosen. The
predicate in (35) is the result of a structure containing many1 ; the predicate
in (36) is based on many2 . If, in the actual world, Jasper read 10 books, then
(35) denotes {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. When, however, the predicate contains
the many2 quantifier, the denotation is a singleton set: {10} if Jasper read
10 books. This is because only the maximal group of books read by Jasper
is such that it is the unique group of that kind of a certain cardinality.
In general, the many2 -based degree predicate extension is a singleton set
containing the maximum of the values in the denotation of the many1 -based
degree predicate.
(35) λd.∃x[#x = d & book(x) & read(j, x)]

(36) λd.∃!x[#x = d & book(x) & read(j, x)]
As discussed above, comparative quantifiers involve maximality operators.

However, the maximal values for degree predicates like (35) and (36) are
always identical. In simple sentences based on a structure like (34), the
option of having two distinct counting quantifiers therefore does not result
in any ambiguity.
When we turn to cases where the degree predicate is formed by moving
the modified numerals over a modal operator with universal force, something
similar can be observed. If Jasper is required to read (exactly) 10 books,
then the structure in (37) yields, again, the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Once
more, the structure which contains the bilateral counting quantifier, the one
in (38), yields the set containing the maximum of its weaker counterpart.
(37) [ λd [ require [ Jasper read d many1 books ] ] ]
     λd.□∃x[#x = d & book(x) & read(j, x)]
(38) [ λd [ require [ Jasper read d many2 books ] ] ]
     λd.□∃!x[#x = d & book(x) & read(j, x)]
Given that the relation between (38) and (37) is once again one of a set and its
maximal value, no ambiguities can be expected to arise when comparative
quantifiers are applied to these two predicates. This is as is desired.
Of course, it could be that the actual situation is not one containing a
specific requirement, but one with for instance a minimality requirement.
Say, for instance, Jasper has to read at least 4 books. In that case, (37) denotes
the set {1, 2, 3, 4}. The extension of (38), however, is the empty set. (In such a
context, there is no specific n such that Jasper has to read exactly n books.)
Clearly, the maximal value for the predicate is undefined in such a case.
This means that the logical form based on many2 will not lead to a sensible
interpretation and, so, we again do not expect to find ambiguity.
The case of predicates that are formed by abstracting over an existential
modal operator is illustrated in (39) and (40). If Jasper is allowed to read a
maximum of 10 books, then the two predicates are equivalent, both denoting
the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.9
(39) λd.♦∃x[#x = d & book(x) & read(j, x)]

(40) λd.♦∃!x[#x = d & book(x) & read(j, x)]
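The extensions claimed for the predicates in (35) to (40) can be computed in a toy model. The sketch below is a hedged illustration: it assumes that a world can be identified with the number of books Jasper reads in it, that the degree scale is finite, and that modal backgrounds are sets of such worlds. The names many1, many2, extension, required and allowed are hypothetical helpers.

```python
DEGREES = range(1, 31)  # assumption: a finite scale suffices for the illustration

def many1(world):
    """Weak predicate, as in (35): some group of d books was read in the world."""
    return lambda d: 1 <= d <= world

def many2(world):
    """Strong predicate, as in (36): the books read number exactly d."""
    return lambda d: d >= 1 and d == world

def extension(M):
    """The set of degrees on the scale that satisfy M."""
    return {d for d in DEGREES if M(d)}

def required(pred, permitted):
    """Universal modal, as in (37)/(38): the predicate holds in every permitted world."""
    return lambda d: all(pred(w)(d) for w in permitted)

def allowed(pred, permitted):
    """Existential modal, as in (39)/(40): the predicate holds in some permitted world."""
    return lambda d: any(pred(w)(d) for w in permitted)

exactly_ten = [10]              # Jasper is required to read exactly 10 books
at_least_four = range(4, 31)    # Jasper has to read at least 4 books
at_most_ten = range(0, 11)      # Jasper is allowed to read a maximum of 10 books
```

Under the at-least-4 requirement, extension(required(many1, at_least_four)) is {1, 2, 3, 4} and extension(required(many2, at_least_four)) is the empty set. Under the at-most-10 permission, the allowed versions of both predicates denote the full range from 1 to 10.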
In sum, the option of two counting quantifiers many1 and many2 is irrelevant
when combined with a comparative quantifier. This is because the comparative
quantifier is based on maximality and the degree predicates containing
the different counting quantifiers do not differ in their maximum value.
9 If there is in addition a lower bound, the two predicates are no longer equivalent, but their
maximum will be.
4 The semantics of class B quantifiers
I now turn to the main proposal: class B quantifiers are maxima/minima

indicators. I start with the upper-bounded modifiers.
4.1 Upper bound class B modifiers
In the formula in (41), MOD↓B generalises over any of the class B modifiers at
most, maximally, up to, etc.10
(41) ⟦MOD↓B⟧ = λd.λM. max_n(M(n)) = d
If the semantics of upper bound class B quantifiers is as in (41), then why is
their distribution so limited? I think the reason for the awkwardness
of many examples with class B quantifiers is that, in many cases,
(41) is a vacuous operator. To be precise, the two propositions in (42) are
equivalent whenever the cardinality predicate M denotes a singleton set. In
such a case, a bare numeral form is to be preferred over a numeral modified
by a class B modifier, since the latter derives the same meaning from a much
more complex linguistic form.
(42) a. max_n(M(n)) = d
     b. M(d)
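The claim that the two conditions in (42) coincide exactly when the predicate is a singleton can be verified exhaustively on a small scale. The following Python sketch assumes a finite scale of six degrees and represents each degree predicate by its (non-empty) extension. All names are hypothetical.

```python
from itertools import combinations

SCALE = range(1, 7)  # assumption: a small finite scale, enough for an exhaustive check

def predicates():
    """Every non-empty degree predicate over SCALE, given as its extension."""
    for size in range(1, len(SCALE) + 1):
        for ext in combinations(SCALE, size):
            yield set(ext)

def max_equals(M, d):
    """(42a): the maximum of the predicate is d."""
    return max(M) == d

def holds_of(M, d):
    """(42b): the predicate holds of d."""
    return d in M

def equivalent(M):
    """Do (42a) and (42b) have the same truth value for every d on the scale?"""
    return all(max_equals(M, d) == holds_of(M, d) for d in SCALE)
```

On this scale, equivalent(M) holds for each of the six singleton predicates and for none of the others: as soon as M contains a second, non-maximal degree, (42b) is true of that degree while (42a) is false.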
What I have in mind exactly is the kind of reasoning underlying Horn’s

division of pragmatic labour (Horn 1984). The idea is that a maxim of brevity,
10 For modifiers like at most and maximally, one might wonder whether (41) is not too restricted,
given that they are capable of modifying DPs more generally. However, it appears that there
is a common mechanism to all uses of such modifiers. For instance, (i) could be assigned its
intuitive meaning if we assume that at most has the semantics in (ii), where the operator
‘max’ compares properties on the rank order [assistant professor < associate professor <
full professor]:
(i) Jasper is at most an associate professor.
(ii) ⟦at most⟧ = λP.λx. max_P′(P′(x)) = P
It goes beyond the scope of this article to implement a formal connection between (ii) and
(41), but it should be clear that the underlying mechanism is the same.
part of Grice’s maxim of Manner (Grice 1975), steers toward minimising the
form used to express something. This causes simple (unmarked) meanings to
be typically expressed by means of simple (unmarked) forms. Marked forms
which by convention could be given the same unmarked meaning as some
unmarked form are instead given a more marked interpretation. There are
many variations and implementations of this idea (McCawley 1978; Atlas &
Levinson 1981; Blutner 2000; van Rooij 2004),11 but what is most relevant for
this paper is the general idea that an unmarked meaning is blocked as an
interpretation for the marked form.
With this in mind, the equivalence of (42a) and (42b) whenever M denotes
a singleton set has profound consequences for when it actually makes sense
to state that the maximum of a degree predicate equals a certain value. That
is, in cases where (42a) equals (42b), we expect that the use of maximally
does not lead to an interpretation based solely on (42a), since the use of the
bare numeral form would result in the same meaning. To illustrate this in
some more detail let us carefully go through the following examples.
We know from the discussion above that one of the interpretations available
for (43) is (44).
(43) Jasper invited 10 people.

(44) ∃!x[#x = 10 & people(x) & invite(j, x)]
Now consider (45), which is interpreted either as (46) or as (47).
(45) Jasper invited maximally 10 people.

(46) [ maximally 10 [ λd [ Jasper invited d many1 people ] ] ]
     max_n(∃x[#x = n & people(x) & invite(j, x)]) = 10
(47) [ maximally 10 [ λd [ Jasper invited d many2 people ] ] ]
     max_n(∃!x[#x = n & people(x) & invite(j, x)]) = 10
The interpretations in (46) and (47) are equivalent. In fact, just like we do
not expect ambiguities to arise with comparative quantifiers on the basis
of the many1 /many2 choice, we do not expect any ambiguities to arise with
MOD↓B quantifiers, for the simple reason that both such operators involve
11 In fact, there is a close resemblance between this prevalent idea in pragmatics and blocking
principles in other parts of linguistics. The commonality is that two different expressions
cannot have identical meanings. See, for instance, the Elsewhere Condition (Kiparsky 1973)
in phonology or the Avoid Synonymy principle (Kiparsky 1983) in morphology.
a maximality operator and that the maximal values of predicates based on

many1 are always those of predicates based on many2 . In what follows,
we will therefore gloss over the two equivalent options by representing the
semantics following the general scheme in (48).
(48) [ maximally 10 [ λd [ Jasper invited d many1/2 people ] ] ]
     max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
Importantly, the single reading of (45) is equivalent to (44), the strong reading
of (43). The example in (43), however, reaches this interpretation by means
of a much simpler linguistic form, one which does not involve a numeral
modifier. I propose that this is why the reading in (48) of (45) does not
surface: it is blocked by (43).12
As observed above, we can nevertheless make sense of (45) once we
interpret the sentence to be about what the speaker holds possible. So, a
further possible reading for (45) is that in (49).
(49) max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
Crucially, this interpretation is not equivalent to (50), which is the result of

interpreting (43) from the perspective of speaker possibility.
(50) ♦∃!x[#x = 10 & people(x) & invite(j, x)]

12 An anonymous reviewer notes two complications with the proposed blocking mechanism.
First of all, s/he wonders why exactly 10 is not blocked in a similar way to minimally 10,
since the same reasoning seems to apply. I acknowledge that this is something that needs to
be explained. Interestingly, this is something any theory that believes in the existence of
an ‘exactly’ sense for numerals has to explain. One promising route has been proposed by
Geurts (2006), who suggests that exactly is semantically empty and that its only function is
“to reduce pragmatic slack” (p. 320). That is, whereas bare 100 allows for an imprecise rough
construal (Krifka 2007a), exactly 100 enforces precision. If Geurts is on the right track, then
there is no reason to expect that exactly 100 is blocked by 100.
A further complication noted by the same anonymous reviewer is that if we assume that
the ‘max’ operator is presuppositional, we might come to expect that maximally 100 blocks
100 instead of the other way around. This prediction appears to be made when at the same
time we assume the Maximize Presupposition principle (Heim 1991). Since maximally 100 and
100 share the same meaning, but the former triggers a presupposition, the use of 100 would
be blocked. This is a very interesting scenario, but since I have little to say about the kind
of presuppositions (if any) expressions like maximally trigger and I furthermore have no
thoughts on how Maximize Presupposition would interact with a brevity maxim, I will leave
this issue to further research.
In other words, the meaning in (49) for (45) is not blocked by the bare numeral
form in (43) since (43) lacks this reading.
To be sure, I do not claim that (50) would be an available reading for (43).
That is, the particular kind of interpretation that examples like (45) receive
is available only as a last resort strategy. Underlying this analysis is the
assumption that there exist silent modal operators. I can offer no independent
evidence for this assumption, but stress that the intuitions regarding examples
like (45) quite clearly point in the direction of some sort of speaker
modality. In work on superlative quantifiers, we find some alternatives to
the present account. Such approaches are meant to deal with at most and at
least only, but if my arguments above are on the right track, then we could
reinterpret these proposals for the semantics of superlative quantifiers as
applying to the whole of class B. For instance, the analysis of class B expressions
presented here differs from that of superlative modifiers in Geurts &
Nouwen 2007. According to the present proposal, the modal flavour of (45) is
due to a silent existential modal operator. In Geurts & Nouwen, however, the
modal was taken to be part of the lexical content of superlative quantifiers.
Another alternative, proposed for superlative modifiers in Krifka 2007b and
which is closer to the present proposal, is to analyse examples like (45) not as
involving a modal operator, but rather a speech act predicate, like assert. In
that framework, the analysis of (45) would say that n=10 is the maximal value
for which ∃(!)x[#x = n & people(x) & invite(j, x)] is assertable, rather than
possible.13 That is, according to Krifka, (45) is interpreted by assigning the
modified numeral scope over an illocutionary force operator, rather than
over a modal operator.
I will return to a comparison of these approaches below. I would like to
point out immediately, however, what I think are the major disadvantages of
both alternatives. The main problem is with examples like (51), which contain
an overt existential modal.
(51) Jasper is allowed to invite maximally/at most 10 people.

13 In his comments on the first version of this paper, David Beaver observed that it is not
necessarily the speaker’s knowledge that matters, as can be seen from (his) example (i).
(i) I know how many people were at the party, but I’ve been told not to reveal that
number to the press. However, there were maximally 50 there.
It would be interesting to see if data like these help in reaching a synthesis of Krifka’s
account and the present proposal.
Its most salient reading is one in which 10 is said to be the maximum number
of people Jasper is allowed to invite. That is, it places an upper bound on
what is allowed. For Krifka, this is problematic since, here, the modified
numeral is quite obviously not a speech act operator. For the proposal in
Geurts & Nouwen 2007, such examples are problematic since the modal
lexical semantics of at most predicts a reading with a double modal operator,
one originating from the verb and one from the numeral modifier. To remedy
this, Geurts and Nouwen provide an essentially non-compositional analysis
of such examples as modal concord.14
In contrast, the current proposal deals effortlessly with examples such
as (51). What was crucial to my explanation of how (45) gets to be interpreted
is that degree predicates based on modals with existential force denote
non-singleton sets even when the counting quantifier associated with the
numeral is many2 . This entails that saying that the maximum value for such
a predicate is n is not equivalent to saying that the predicate holds for n.
More formally, there is a contrast between (52a) and (52b).
(52) a. max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
        ⇔ ∃!x[#x = 10 & people(x) & invite(j, x)]
     b. max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
        ⇎ ♦∃!x[#x = 10 & people(x) & invite(j, x)]
As a result, whenever an upper bound class B modifier scopes over an

existential modal, no blocking from the simpler bare numeral form will be
able to take place. The application of an upper bound class B quantifier to a
degree predicate is only felicitous if the resulting readings are not readings
that can be expressed just as well by omitting the class B modifier. This is
the case when a modal with existential force has scope inside the degree
14 A further problem I see with the proposal in Krifka 2007b is that the analysis does not
appear to extend straightforwardly to illocutionary forces other than assertion, although in
fairness this might be because (at the time of writing) no detailed exposition of this theory
exists. For instance, nothing suggests that superlative modified numerals can scope over a
question operator in questions.
An additional disadvantage for the proposal of Geurts and Nouwen is that it does not
yield an explanation of the lexical form of class B modifiers. Whereas the current proposal
assigns to a modifier like maximally the semantics of a maximality operator, an extension of
Geurts and Nouwen’s approach would have to take it to be a modal, thereby disassociating it
from the intuitive meaning of maximal.
predicate.15 Treating upper bound class B quantifiers as maxima indicators

thus also predicts the absence of weak readings for examples like (51). Given
the flexible scope of the numeral modifier we expect this sentence to have
two corresponding logical forms, (54a) and (54b). (From here on, ◇ indicates
deontic modality, to distinguish it from the (epistemic) speaker possibility
♦.)
(53) Jasper is allowed to invite maximally/at most 10 people.
(54) a. max_n(◇∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
     b. ◇[max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10]
If maximally 10 is taken to have wide scope over the modal, then we arrive
at (54a), the reading that says that the maximum number of people Jasper
is allowed to invite equals 10. This is not a semantic interpretation that is
available for (55). Its many2 reading, for instance, says that inviting exactly
10 people is something that Jasper is allowed to do. This is much weaker
than (54a). (The only way we can arrive at an equally strong reading for (55)
is by means of implicature.)
(55) Jasper is allowed to invite 10 people.
If we take the modal in (51) to have widest scope, as in (54b), the resulting
interpretation is one in which inviting exactly 10 people is allowed for Jasper.
This is the reading for (55) discussed above, and so it is blocked. As a result,
(54a) is the only interpretation available.
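The blocking argument can be made concrete in a toy model. The Python sketch below is only an illustration: it assumes that a permitted world can be identified with the number of people Jasper invites in it, it uses the strong (many2-based) degree predicate throughout, and its function and scenario names are hypothetical. It compares the two scope readings of the modified numeral with the strong reading of the bare numeral sentence (55).

```python
DEGREES = range(1, 31)  # assumption: a finite scale suffices for the illustration

def max_degree(M):
    """Maximum of a degree predicate, or None if no degree satisfies it."""
    values = [n for n in DEGREES if M(n)]
    return max(values) if values else None

def invite_many2(world):
    """Strong predicate: Jasper invites exactly n people in the world."""
    return lambda n: n >= 1 and n == world

def modifier_wide(permitted, d):
    """(54a): d is the largest number of people Jasper is allowed to invite."""
    return max_degree(lambda n: any(invite_many2(w)(n) for w in permitted)) == d

def modal_wide(permitted, d):
    """(54b): in some permitted world, the maximum of the degree predicate is d."""
    return any(max_degree(invite_many2(w)) == d for w in permitted)

def bare_strong(permitted, d):
    """Strong reading of (55): inviting exactly d people is allowed."""
    return any(invite_many2(w)(d) for w in permitted)

scenarios = {
    "at most 10": range(0, 11),
    "exactly 10": [10],
    "anything goes": range(0, 31),
    "at most 5": range(0, 6),
}
```

In every scenario listed, modal_wide and bare_strong return the same value for d = 10, which is the sense in which the narrow-scope reading adds nothing to the bare numeral. The wide-scope reading differs: in the "anything goes" scenario bare_strong is True but modifier_wide is False.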
An interesting side to the account presented here is that the upper bound
class B quantifiers do not encode the ≤ relation. As maxima indicators, their
application only makes sense if what they apply to denotes a range of values.
Otherwise, using the strong reading of the bare numeral form will do just as
well.
Interestingly, the approach also predicts that some of the examples I
discussed above do not only result in a blocking effect, but could moreover
be predicted to be false. For instance, according to the approach set out
above, the meaning of (56a) is that in (56b).
15 As far as I can see, assertability would have the same (crucially weak) properties as possibility.
So, should a silent speech act predicate seem more plausible than a silent modal operator,
then ♦ can just as well be interpreted as expressing assertability. It appears that such a
move would be largely compatible with the proposal of Krifka 2007b.
(56) a. #A triangle has maximally 10 sides.

b. ‘the maximum number of sides in a triangle is 10’
The reading in (56b) is not only blocked by A triangle has 10 sides, but
it is moreover plainly false. I believe that this predicts that (56a) should
be expected to have a somewhat different status from (57), which strictly
speaking has a true interpretation, but one that can be expressed by simpler
means.
(57) #A triangle has maximally 3 sides.
It is difficult to establish whether this difference in status is borne out, or

even how this difference can be recognised. However, my own intuition tells
me that while (56) is never acceptable, (57) could be used in a joking fashion.
Native speakers inform me that (58) is marginally acceptable:
(58) ?A triangle has minimally and maximally 3 sides.
4.2 Lower-bound class B modifiers
Lower-bound class B modifiers correspond to minimality operators. Let MOD↑B
correspond to any of the class B expressions at least, from, minimally, etc.
(59) MOD↑B = λd.λM. min_n(M(n)) = d
Note first that minimality operators are sensitive to the many1 / many2
distinction. Consider the degree predicate [λd. John read d many1/2 books]
and, say, that John read 10 books. In the many1 version of the logical form,
the minimal degree equals 1. In fact, independent of how many books John
read, as long as he read books, the minimal degree will always be 1. In the
many2 version of the logical form, the predicate denotes a singleton set, {10}
if John read 10 books. The minimal degree in that case is, of course, 10.
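The difference between the two construals can be checked mechanically. The following Python sketch is only an illustration under simplifying assumptions: a degree predicate is modelled as the finite set of numbers it holds of, and the helper names many1_degrees and many2_degrees are hypothetical, not part of the proposal.

```python
def many1_degrees(books_read: int) -> set[int]:
    """Degrees d such that there is a group of d books John read (many1)."""
    # Any group of k books contains subgroups of every size from 1 to k.
    return set(range(1, books_read + 1))


def many2_degrees(books_read: int) -> set[int]:
    """Degrees d such that John read exactly d books (many2)."""
    return {books_read} if books_read > 0 else set()


# Compare the minimal degree under both construals for a few totals.
for total in (3, 10, 25):
    print(total, min(many1_degrees(total)), min(many2_degrees(total)))
```

Under these assumptions the minimum of the many1 set is 1 whatever the total, while the minimum of the many2 set is the total itself.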
These observations already straightforwardly account for our intuitions
for an example like (60).
(60) John read minimally 10 books.
The many1 interpretation of (60) will be rejected, for it will always be false.
The minimal value for any simple many1-based degree predicate is always 1.
The many2 interpretation of (60) will be rejected too, for it will correspond
to an interpretation saying that John read (exactly) 10 books. This reading is
blocked by the bare numeral. (In fact, (60) in the many2 variant is equivalent
to John read maximally 10 books, which, as was explained above, is blocked
for the same reasons.)
We can save (60) by interpreting it with respect to an existential modal
operator. This yields two readings:
(61) a. min_d(♦∃x[#x = d & read(j, x) & book(x)]) = 10
b. min_d(♦∃!x[#x = d & read(j, x) & book(x)]) = 10
The form in (61a) is once more a contradiction: the minimal degree for which
it is deemed possible that John read d-many1 books is always 1. The reading
in (61b) is much more informative. It says that the minimal number for
which it is thought possible that John read exactly so many books is 10. In
other words, this says that it is regarded as impossible that John read fewer
than 10 books. This is exactly the reading that is available.
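The effect of the existential modal in (61) can be sketched in the same way. In the following illustrative Python fragment, each epistemically possible world is identified (by assumption) with the total number of books John read in it; the function names and the sample set of worlds are hypothetical.

```python
def possible_many1(worlds: set[int]) -> set[int]:
    """Degrees d such that in some world John read a group of d books, cf. (61a)."""
    return {d for w in worlds for d in range(1, w + 1)}


def possible_many2(worlds: set[int]) -> set[int]:
    """Degrees d such that in some world John read exactly d books, cf. (61b)."""
    return {w for w in worlds if w > 0}


# Hypothetical epistemic state: the speaker considers totals of 10 to 14 possible.
worlds = {10, 11, 12, 13, 14}

print(min(possible_many1(worlds)))  # minimum under the many1 construal
print(min(possible_many2(worlds)))  # minimum under the many2 construal
```

In this toy model the many1 minimum is 1, so a form like (61a) cannot equal 10, whereas the many2 minimum is the lowest total the speaker considers possible.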
4.3 Beyond modals
Some words are in order on the interaction of numeral modifiers with non-
modal operators. Given the current proposal, any property that involves
existential quantification would license the use of a class B modifier. However,
it is known that degree operators (which we take modified numerals to be)
cannot move to take scope over nominal quantifiers (cf. Kennedy 1997; Heim
2000).16 This explains why (62) does not have the reading in (63).
(62) Someone is allowed to invite maximally 50 friends.

(63) the person who is allowed to invite most friends is allowed to invite
50 friends
As observed above, however, bare plurals do interact with class B quantifiers,
as in for instance example (9). This would suggest that some intensional/modal
analysis of the readings involved in such examples is in order.
(Thanks to Maribel Romero for pointing this out to me.) I will leave a detailed
analysis of these cases for further research.
16 In Heim’s formulation: If the scope of a quantificational DP contains the trace of a degree
phrase, it also contains that degree phrase itself. See Heim 2000 for details.
5 Maximal and minimal requirements
As Hackl (2001) observed, there is an interesting interaction between modified
numerals and modals. I have extended these observations by showing how
existential modals have a tight connection to class B modifiers in that they
license their (otherwise blocked) existence. What I have not discussed so far
is how class B modifiers interact with universal modals. It turns out that this
part of the story is not straightforward at all.
Given my proposal in the previous section, we expect that there are in
principle four logical forms that correspond to (64).17
(64) Jasper should read minimally 10 books.

(65) □ > min:
The minimum n such that Jasper will read n books should be 10
a. □[min_n(∃x[#x = n & book(x) & read(j, x)]) = 10]     many1
b. □[min_n(∃!x[#x = n & book(x) & read(j, x)]) = 10]    many2
(66) min > □:
The minimum n such that Jasper should read n books is 10
a. min_n(□∃x[#x = n & book(x) & read(j, x)]) = 10     many1
b. min_n(□∃!x[#x = n & book(x) & read(j, x)]) = 10    many2
It turns out that none of these logical forms provides a reading that is
in accordance with our intuitions regarding (64). First of all, notice that
min_n(∃x[#x = n & book(x) & read(j, x)]) = 10 is a contradiction. If there
are 10 books that Jasper read, then there is also a singleton group containing
a book Jasper read. The minimum number of books Jasper read is therefore
either 1 (in case he read something) or 0 (in case he did not read anything).
It could never be 10. Consequently, (65a) is a contradiction. For a similar
reason, (66a) is a contradiction too. If there needs to be a group of 10 books
17 In this paper, I ignore readings which (for the case of at least) Büring (2008) calls speaker
insecurity readings and which Geurts & Nouwen (2007) discuss extensively. Basically, this
reading amounts to interpreting the modal statement with respect to speaker’s knowledge.
Such readings are especially prominent with superlative quantifiers. For instance, the speaker
insecurity reading of Jasper should read at least 10 books is: the speaker knows that there is
a lower bound on the number of books that Jasper should read, s/he does not know what
that lower bound is, but s/he does know that it exceeds 9.
Furthermore, I also ignore a reading of (64) in which 10 books is construed as a specific
indefinite. In that reading, (64) states that there are 10 specific books such that only if Jasper
reads these books will he comply with what is minimally required.
read by Jasper, then there also need to exist groups containing just a single
book read by Jasper. Once again, the minimum number referred to in (66a)
is either 0 or 1, never 10.
Turning to (65b), notice that the min_n-operator is vacuous here, since
there is just a single n such that Jasper read exactly n books. This renders
(65b) equivalent to the many2 reading of Jasper should read 10 books, and
so we predict it to be blocked. The interpretation in (66b) does not fare any
better. In fact, the min_n-operator is vacuous here as well. This means that
(65b) is equivalent to (66b) and that it is consequently also blocked. Even if
no blocking were to take place, (65b)/(66b) offer the wrong interpretation
anyway. They state that Jasper must read exactly 10 books (no more, no
fewer), which is not what (64) means.
One might think that the problems with (65b) and (66b) can be remedied
by abandoning quantification over sums and instead using reference to
(maximal) sums. For instance, (67) represents the truth-conditions we are
after. (Here σ_x returns the maximal sum that, when assigned to x, verifies the
scope of σ.)
(67) min_n(□[#σ_x(book(x) & read(j, x)) ≥ n]) = 10
Still, here too the application of min_n is not meaningful, since there is only
a single n such that □[#σ_x(book(x) & read(j, x)) ≥ n] holds, which is 10
if (64) is true. As a consequence, it would not matter whether we applied
a maximality or a minimality operator. We then wrongly predict that (68)
should share a reading with (64). (Note that (65b) and (66b) suffer from the
same odd prediction, given that the operator min_n has no semantic impact
there either.)
(68) Jasper should read maximally 10 books.
It appears then that the proposal defended in this article fails hopelessly on
sentences like (64). As I will show, however, things are not so dire as they
appear. In fact, I will argue that what we stumble upon here is a general,
but poorly understood property of modals, which could be summarised as
follows:
(69) Generalisation: universal modal operators are interpreted as operators
with existential modal force when minimality is at stake
An illustration of (69) is (70), which is a satisfactory paraphrase for (64).
What is striking is that this paraphrase contains allow instead of should.
(70) 10 is the smallest number of books Jasper is allowed to read
I will not offer an explanation for this generalisation (but see Nouwen 2010a
for an attempt). I will simply show that if we look a bit closer at the inter-
pretation of modal operators, then we come to understand that my theory
actually yields a welcome analysis.
5.1 Previous analyses
There is a precedent. In an earlier theory of at least, Geurts & Nouwen 2007,
the correct predictions regarding its relation to universal modals are arrived
at by an essentially non-compositional mechanism. A central claim made in
that paper is that superlative quantifiers are modal expressions themselves.
For instance, (71a) was proposed to correspond to (71b).18 Furthermore, it
was assumed that there may be a non-compositional interaction between the
modal that is implicitly contributed by a modified numeral and an explicit
modal operator. For instance, (72a) is interpreted as an instance of modal
concord, as in (72b), where the two modals fuse and the modal takes on the
deontic flavour of need.19
(71) a. John read at least 10 books.
b. ∃x[#x = 10 & book(x) & read(j, x)]
(72) a. John needs to read at least 10 books.
b. □∃x[#x = 10 & book(x) & read(j, x)]
18 This is how I see the theoretical landscape: Although not immediately obvious, the proposal
by Geurts and Nouwen already carries in it the idea that superlative quantifiers are minimality
and maximality operators. For instance, (71b) is equivalent to stating that 10 is the minimal
number of books John is allowed to read. Given the basic idea of treating class B operators
as min/max-operators, one has a range of options to account for the distribution of such
quantifiers and for their behaviour in intensional contexts. Geurts and Nouwen represent
one extreme, where the lexicon specifies the exact behaviour of such quantifiers (together
with the rule of modal concord). The present proposal puts forward the other extreme,
where the lexical entry for superlative (and other class B) quantifiers is rather minimal, and
where pragmatic mechanisms account for distribution and behaviour in intensional contexts.
19 I am simplifying the analysis here a little bit. Geurts & Nouwen (2007) propose that
there is an additional conjunct to the meaning of sentences containing superlative quan-
tifiers, for which they leave implicit whether it is entailed or implicated. For (71), for
instance, there would be an additional condition in the truth-conditions saying:
¬∃x[#x > 10 & book(x) & read(j, x)]. Similarly for (72).
The approach of Geurts and Nouwen is the most broadly applicable approach
to superlative quantifiers in the (admittedly small body of) literature on that
topic. There are alternatives on the market, but they do not handle examples
like these very well. As I mentioned above, Krifka (2007b) takes at least to
be a speech act modifier. Basically, an example like (71) is analysed by Krifka
in terms of what the speaker finds assertable and is paraphrased as follows:
the lowest n such that it is assertable that John read n books is 10. When
at least is embedded in an intensional context, however, it does not modify
the strength of assertability, but rather the intensional operator. So, taking
Krifka’s analysis as suitable not just for superlative, but rather for all class B
quantifiers, (72a) would be paraphrased as (73).
(73) 10 is the smallest value for n such that John should read n books
In such cases, Krifka’s analysis is identical to the one I have set out above
and it runs into exactly the same problem: (73) is not the reading we are after.
Rather, (72a) means that 10 is the smallest number of books John is allowed
to read.
5.2 Minimal requirements
Geurts & Nouwen (2007) and Krifka (2007b) say nothing about the distinc-
tion between class A and class B expressions. However, if we extend their
proposals for superlative quantifiers to cover all B-type quantifiers, then we
have an interesting trio of competing characterisations of such expressions.
At face value, the observations made so far in this section would appear to
speak in favour of the modal concord proposal of Geurts & Nouwen (2007)
(generalised to all class B quantifiers) and against the account defended here
or in Krifka 2007b. As I will argue now, however, there are reasons to believe
that the problematic predictions made by the latter two theories are not due
to the semantics of the modified numeral, but are actually the result of an
overly simplistic understanding of requirements. What I will do is discuss in
some detail examples like (74).
(74) The minimum number of books John needs to read to please his
mother is 10.
Notice, first of all, that on an intuitive level, (74) is equivalent to (75).
(75) John needs to read minimally 10 books to please his mother.
Note, secondly, that (74) spells out the semantics I have proposed for (75).
What I will show now is that when we look into the semantic details of (74),
we will run into exactly the same problems as we did for (75). What this
shows is that rather than thinking that my account of class B quantifiers is
on the wrong track, there are actually reasons to believe that the proposal
lays bare a hitherto unexplored problem for the semantics of modals like
need, require, etc.
Let us consider the semantics of (74). Suppose that the minimal requirement
for pleasing John’s mother is indeed that John reads 10 books.
That is, if John reads 10 or more books, she is happy. If he reads fewer,
she will not be pleased. Standard accounts of goal-directed modality (von
Fintel & Iatridou 2005) assume that statements of the form to q, need to p
are true if and only if p holds in all worlds in which the goal q holds. Below, I
refer to the worlds in which John pleases his mother as the goal worlds. It is
instructive to see what we know about the propositions that are true in such
worlds. The following is consistent with the context described above.
(76) a. In all goal worlds: ∃x[#x = 10 & book(x) & read(j, x)]
b. In all goal worlds: ∃x[#x = 9 & book(x) & read(j, x)]
c. In all goal worlds: ∃x[#x = 1 & book(x) & read(j, x)]
d. In some (not all) goal worlds: ∃x[#x = 11 & book(x) & read(j, x)]
e. In some (not all) goal worlds: ∃x[#x = 12 & book(x) & read(j, x)]
f. In no goal world: ¬∃x[book(x) & read(j, x)]
Let us now analyse some examples. First of all, (77a) and (77b) are intuitively
true and are also predicted to be true ((77a) by virtue of (76a) and (77b) by
virtue of (76c)).
(77) a. To please his mother, John needs to read 10 books.
b. To please his mother, John needs to read a book.
The example in (78) is intuitively false, and is also predicted to be false, for
the context is such that there are goal worlds in which John reads only 10,
and not 11, books.
(78) To please his mother, John needs to read 11 books.
So far, so good. If we turn to examples that place a bound on what is required,
however, then the theory makes a wrong prediction. The example in (79) is
intuitively false. If interpreted as (80), however, it is predicted to be true (by
virtue of (76c)).
(79) The minimum number of books John needs to read, to please his
mother, is 1.
(80) min_n[In all goal worlds: ∃x[#x = n & book(x) & read(j, x)]] = 1
In general, theories such as that of von Fintel & Iatridou (2005) make the
following prediction. Let S be an entailment scale of propositions and let p
be a proposition on this scale that is a minimal requirement for some goal
proposition q. Then a statement of the form “the minimum requirement to q
is p” comes out false unless p is the minimal proposition of S. This is a
devastating prediction: minimal requirements could never be expressed, since
they would always correspond to the absolute minimum.
One might think that what is going wrong in the example above is that I
assume that when we talk about how many books John read we should be
talking about existential sentences, that is about at least how many books
John read. The alternative would be to describe the number of books John
read by means of the counting quantifier many2, that is, how many books
John read exactly. I’m afraid this only makes the problem worse. Here is a
description of the relevant context in terms of the exact number of books
that were read by John.
(81) a. In some but not all goal worlds: John read exactly 10 books.
b. In no goal world: John read exactly 9 books.
c. In no goal world: John read exactly 1 book.
d. In some but not all goal worlds: John read exactly 11 books.
e. In some but not all goal worlds: John read exactly 12 books.
Now, there is no number n such that John read exactly n books in all goal
worlds. So, the smallest number of books John needs to read does not refer.
The upshot is that there is no satisfactory analysis of examples like (74)
under the assumptions made here. In general, it seems that, under standard
assumptions, there is no satisfactory analysis of minimal requirements.
Whatever way we find to fix the semantics of cases like (74), however, this fix
will work to save the account of class B quantifiers too, for (74) was a literal
spell-out of the proposed interpretation of similar sentences with at least,
minimally, etc. It goes beyond the scope of this article to provide such a fix.
The overview in (81), however, can help to indicate where we should look for
a solution.20 Given that there is no goal world in which John read exactly
n books for n’s smaller than 10, it follows that 10 is the minimal number
of books John could read to please his mother. In other words, examples
like (74) show that, in the scope of a minimality operator, modals that are
lexically universal quantifiers get a weaker interpretation.
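The reasoning about goal worlds in (76), (80) and (81) can be replayed in a small model. The Python sketch below is an illustration under stated assumptions, not an implementation of any particular theory of modality: each goal world is identified with the total number of books John reads in it, the set {10, 11, 12} is a hypothetical stand-in for the context described above, and all names are invented for the example.

```python
# Hypothetical goal worlds, each identified by the total number of books
# John reads in it; his mother is pleased iff he reads 10 or more.
goal_worlds = {10, 11, 12}


def at_least(world: int, n: int) -> bool:
    """John read a group of n books in this world (existential construal)."""
    return 1 <= n <= world


def exactly(world: int, n: int) -> bool:
    """John read exactly n books in this world (counting construal)."""
    return world == n


candidates = range(1, max(goal_worlds) + 1)

# (80): degrees that hold in ALL goal worlds under the existential construal.
required_at_least = [n for n in candidates if all(at_least(w, n) for w in goal_worlds)]
# (81): degrees that hold in ALL goal worlds under the counting construal.
required_exactly = [n for n in candidates if all(exactly(w, n) for w in goal_worlds)]
# Weaker interpretation: degrees that hold in SOME goal world (counting construal).
possible_exactly = [n for n in candidates if any(exactly(w, n) for w in goal_worlds)]

print(min(required_at_least), required_exactly, min(possible_exactly))
```

In this model the universal reading over the existential construal yields a minimum of 1, the universal reading over the counting construal yields no degree at all, and only the existential reading over the counting construal yields the intuitively correct minimum of 10.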
That said, it is time to revisit example (64), repeated here as (82).
(82) Jasper should read minimally 10 books.
My proposal generated four logical forms, two of which were contradictory
and two of which were blocked by a non-modified form. Let us revisit one of
these logical forms, namely the one with a narrow scope modal and a doubly
bound counting quantifier, represented in (83). The resulting truth-conditions
were presented above as (84).
(83) [ minimally 10 λn [ should [ Jasper read [ n-many2 books ] ] ] ]

(84) min_n(□∃!x[#x = n & book(x) & read(j, x)]) = 10
The discussion in the current section suggests that it is a misunderstanding
to assume that (83) is interpreted as (84); rather, it looks like there
is a mapping to a form like (85).
(85) min_n(♦∃!x[#x = n & book(x) & read(j, x)]) = 10
This captures the intuitive meaning of (82).

At this point I do not have anything to offer which provides the mechanism
behind the generalisation that the combination of a universal modal and a
minimality operator leads to a semantics which is existential in nature. What
is relevant for the present purposes is that this is a general phenomenon.
Interestingly, this means there are noteworthy connections to other areas
where the semantics of a modal statement appear mysterious. Schwager
(2005), for instance, notices that certain imperatives, which are standardly
considered to have universal modal force, require a weaker semantics. Her
key examples are German imperatives containing for example.
(86) Q: How can I save money?
A: Kauf zum Beispiel keine Zigaretten!
Buy for instance no cigarettes
“For example, don’t buy any cigarettes!”
20 See Nouwen 2010a for a proposal along these lines.
In the context of the question asked in (86), the imperative does not convey
that to comply with the advice, the hearer has to stop buying cigarettes.
Instead, it is interpreted as stating that one of the things one could do to
save money is to stop buying cigarettes. Thus, examples like these display
a mechanism that is similar to the interaction of numeral modifiers and
modality.
The mysterious interaction of modified numerals and modals is moreover
reminiscent of the interaction of modals and disjunction (Zimmermann 2000;
Geurts 2005; Aloni 2007), especially since, on an intuitive level at least,
a class B modified numeral like minimally 10 (and, quite obviously, 10 or
more) appears to correspond to a disjunction of alternative cardinalities,
with 10 as the minimal disjunct.21 A central issue in the literature on modals
and disjunction is that classical semantic assumptions fail to capture the
entailments of sentences where a disjunctive statement is embedded under a
modal operator (Kamp 1973). A detailed comparison of this complex issue
with the discussion of minimal requirements that I presented here, however,
will be left to further research.
6 More about the A/B distinction
In this section, I will attempt to give some initial answers to three empirical
questions concerning the distinction between class A and B modified nu-
merals that is central to this article. First of all, I turn to the issue of which
expressions go with which class. So far, I have restricted my attention mostly
to, on the one hand, comparative quantifiers (as proto-typical class A expres-
sions) and, on the other hand, superlative, minimality/maximality and up
to-modified numerals (as representatives of class B). What about expressions
like the prepositional over n or under n or the double bound between n and
m or from n to m? Below, I will turn briefly to such expressions.
A second empirical question concerns the validity of the examples used
so far. Although I believe that the intuitions concerning the constructed
examples in this article are rather clear, my plea for two kinds of modified
numerals would still benefit from some independent objective support.
Below, I present the results of a small corpus study that clearly reflects the
distinction argued for in this article.
Finally, this section will turn to the cross-linguistic generality of the
21 See Nilsen 2007 and Büring 2008 for suggestions along this line for the modifier at least
only.
proposal. I will provide data from a more or less random set of languages that
suggest that the class A/B distinction is not a quirk of English or Germanic,
or even Indo-European, but is, in fact, quite general.
6.1 Filling in class A and B
I will leave it an open question exactly which quantifiers belong to which class.
Nevertheless, I can already offer some speculations on several quantifiers
that I have so far not discussed. To start with disjunctive quantifiers, it
appears that these are clear cases of class B expressions.
(87) a. #A triangle has 3 or more sides.
b. #A triangle has 3 or fewer sides.
With disjunctive quantifiers in class B, one might wonder whether there are
any examples of class A expressions other than the familiar comparative
quantifiers more/fewer/less than n. I think that locative prepositional modifiers
are likely candidates for class A membership. In fact, I believe
that the locative/directional distinction in spatial prepositions corresponds
to the class A/B distinction when these prepositions are used as numeral
modifiers.
Roughly, locative prepositions express the location of an object and are
compatible with the absence of directionality or motion. Directional prepo-
sitions, on the other hand, cannot be used as mere indicators of location.
(88) Locative:
a. John was standing under a tree.
b. That cloud is hanging over San Francisco.
c. Breukelen is located between Utrecht and Amsterdam.
(89) Directional:
a. #John was standing up to here.
b. #John was standing from here.
c. #Breukelen is located from Utrecht to Amsterdam.
Now, compare (90a) and (90b).
(90) a. You can get a car for under €1000.
b. You can get a car for maximally €1000.
The example in (90b) is somewhat strange, since it claims that the most
expensive car you can buy is €1000. The example in (90a), in contrast, makes
no such claim. It clearly has a weak reading: there are cars that are cheaper
than €1000 and there might be more expensive ones too. As explained above,
such weak readings are typical for class A quantifiers and do not occur with
class B quantifiers.22 Furthermore, under seems perfectly compatible with
definite amounts, such as in (91).
(91) The total number of guests is under 100. To be precise, it’s 87.
Class A is then not restricted to comparative constructions only. In fact,
other locative prepositions seem to behave similarly to under.
(92) The total number of guests is between 100 and 150. It’s 122.
The locative complex preposition between . . . and . . . contrasts with its
directional counterpart from . . . (up) to . . . , which behaves like a class B
modifier: it is incompatible with definite amounts, as in (93), but felicitous if
it relates to a range of values.
(93) #The ticket to the Stevie Wonder concert that I bought yesterday cost
from €100 to €800.
(94) Tickets to the Stevie Wonder concert cost from €100 to €800.
It appears then that locative prepositions turn into class A modifiers, while
directional ones turn into class B modifiers. A potential counterexample,
however, is over, which apart from a (relatively rarely used) locative sense, as
in (88b), has a directional sense, such as exemplified in (95).
(95) The bird flew over the bridge.
As a numeral modifier, however, over looks like a class A element. In (96),
over 100 clearly relates the precise weight 104 kg to 100 kg. Note in
(97) how this contrasts with the directional 100 . . . and up, which is made
22 An anonymous reviewer notes a complication. It appears that under cannot take wide scope
with respect to a modal. That is, it fails to display scope ambiguities such as the one in (20)
above. For instance, (i) (which is an example given by the reviewer) is odd, since it misses an
interpretation where the modified numeral has scope over require.
(i) #John is required to come up with under 6 brilliant ideas.
felicitous by embedding it under an existential modal.
(96) He weighs over 100 kg. To be precise, he weighs 104 kg.

(97) a. #He weighs 100 kg and up.
b. He is allowed to weigh 100 kg and up.
A potential explanation for why the numeral modifier over lacks a direc-
tional/class B sense23 is that the use of prepositions in numeral quantifiers is
restricted to prepositions that are vertically oriented. This is connected to
the observation of Lakoff & Johnson 1980 that cardinality is metaphorically
vertical: more is higher (as in a high number), less is lower (as in a low
number). Prepositions in modified numerals follow this metaphor.24 What
is interesting about over, however, is that only its locative sense is vertical.
Its directional sense, as in (95), rather expresses a mainly horizontal motion.
This could explain why there is no class B sense numeral modifier over.
Further clues that this analysis is on the right track come from Dutch,
where the preposition over lacks a locative sense.
(98) #De wolk hangt over San Francisco.
The cloud hangs over San Francisco.
(99) De vogel vloog over de brug.
The bird flew over the bridge.
Instead of over in (98), boven (above) should be used for locative meanings.
(100) De wolk hangt boven San Francisco.
The cloud hangs above San Francisco.
‘The cloud hangs over San Francisco.’
In Dutch, only boven can modify numerals. Over, which lacks a vertical sense,
is unacceptable in modified numerals.
(101) Inflatie kan {boven / #over} de 10% zijn.
Inflation can {above / over} the 10% be.
‘Inflation can be over 10%’
23 Thanks to Joost Zwarts for discussing this matter with me.
24 Up (to) and under are clearly vertical. Between and from . . . to are compatible with all possible
axes.
I will refrain from attempting to offer further evidence for my suggestion
that there is a correspondence between the locative/directional and the A/B
distinction. In any case, it should be clear that the set of prepositional
quantifiers offers an interesting range of contrasts that support the existence
of two classes of modified numerals.
To summarise this subsection, I tentatively put forward the following
classification for English modified numerals.
(102) Class A
(Positive:) more than —, over —
(Negative:) fewer than —, less than —, under —
(Neutral:) between — and —
(103) Class B
(Positive:) at least —, minimally —, from — (up), — or more
(Negative:) at most —, maximally —, up to —, — or fewer, — or less
(Neutral:) from — to —
Missing from this classification are the negative comparative quantifiers like
no more/fewer than 10. The reason for this is that the occurrence of negation
complicates the comparison with other quantifiers. In fact, I think that such
quantifiers are best treated as the compositional combination of a class A
comparative modifier with a negative differential no. See Nouwen 2008b for
the consequences of such a move and for more details on the interpretations
available for sentences containing such quantifiers.
6.2 Support for the A/B distinction from a corpus study
I now turn to a small corpus study I conducted which supports the division
between class A and class B modifiers. Recall that one of the central observations
in favour of the distinction is connected to contrasts such as (104).
Whereas (104a) can be interpreted with respect to a definite actual number
of people invited by Jasper, (104b) does not allow such an interpretation and
instead is evaluated in relation to what the speaker holds possible.
(104) a. Jasper invited fewer than 100 people. 87, to be precise.
b. Jasper invited maximally 100 people. #87, to be precise.
I explained this contrast by proposing that upper bound class B quantifiers
are indicators of maxima. The indication of the maximum of a single value
leads to infelicity. Existential modals, however, introduce a range of (possible)
values, which thereby license the application of the maxima indicator. For
examples like (104b), where no overt modal is present, the hearer will have
to accommodate an interpretation with respect to speaker possibility. Given
that ♦-modals license the application of an upper bound class B modifier,
one would expect, however, that class B modifiers co-occur with an overt
modal operator relatively often. I conducted a corpus study to find out
whether this expectation is fulfilled.
6.2.1 Method
I used the free service for searching the Corpus of Contemporary American
English (COCA, 385 million words, a mix of fiction, science, newspaper and
entertainment texts and spoken word transcripts) at americancorpus.org
(Davies 2008). For each numeral modifier I took 100 quasi-random25 occur-
rences of the modifier with a numeral. For each of these cases I examined
whether the modified numeral was in the scope of an explicit existential
modal operator (such as can, could, might, possibly, allow, etc.). In other
words, I only looked at the surface form and only counted the number of
cases where a modal expression has a scope relation with a modified numeral.
Given the theory presented in this article, the prediction is that this number
is significantly higher with class B numerals than with class A expressions.
I compared five modifiers: fewer than, under, between, at most and up
to. Not all occurrences of these modifiers with a numeral in the corpus were
taken into consideration. For instance, (105) was ignored because in this
example up to is probably not a constituent.26 That is, this example contains
the particle verb to lift up, rather than the verb to lift.
(105) Periodically we’d lift up to 60 kilometers where the temperatures

and pressures are more like Earth’s.
I similarly disregarded occurrences of under n where under is a regular
preposition rather than a numeral modifier (for instance, examples
resembling He was known under 2 different names).
25 ‘Quasi’, since the results are given in chronological order and I would just take the earliest
hits.
26 From: “To boldly go. . . ”, Donald Robertson (1994), Astronomy, Vol. 22, Iss. 12; pg. 34, 8 pgs.
6.2.2 Results
The results, summarised in the table in (106), support the proposal in this
article. Here, P is the percentage of occurrences within an existential modal
context, within a sample of 100 occurrences of that modifier.27
(106)     Class A                          Class B
          fewer than   under   between     at most   up to
     P    4%           3%      4%          23%       21%
The corpus thus shows a clear preference for combining class B quantifiers
with existential modal operators, as was predicted.28 Whether the data are as
clear as (106) for other expressions too remains to be seen. It will be difficult
to extend this type of study to other modifiers. Maximally and from. . . to,
for instance, were included in the present corpus search, but did not yield
enough occurrences to make a meaningful comparison.
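The significance figure reported for (106) can be recomputed from the table alone. The following is a minimal sketch, assuming that the three class A samples and the two class B samples are pooled (300 and 200 occurrences respectively) and that a Pearson chi-square test without continuity correction is applied. Both the pooling and the choice of test are assumptions rather than something stated in the text, and all function and variable names are illustrative.

```python
import math

# Percentages from table (106); each modifier was sampled 100 times,
# so the percentages double as raw counts of modal-context occurrences.
class_a = {"fewer than": 4, "under": 3, "between": 4}
class_b = {"at most": 23, "up to": 21}
SAMPLE_SIZE = 100  # occurrences examined per modifier


def pooled_row(counts):
    """Return (modal, non-modal) counts pooled over one class (assumed pooling)."""
    modal = sum(counts.values())
    total = SAMPLE_SIZE * len(counts)
    return modal, total - modal


def pearson_chi_square(table):
    """Pearson chi-square statistic for a 2x2 table, no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


table = [pooled_row(class_a), pooled_row(class_b)]
chi2 = pearson_chi_square(table)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.1f}, df = 1, p = {p:.3e}")
```

With the pooled counts 11/300 and 44/200, the statistic comes out at about 41.2 on one degree of freedom, in line with the value given in footnote 28.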
6.3 The cross-linguistic generality of the distinction
The class A/B distinction is not a peculiarity of the English language. I will
suggest in this subsection that, in fact, the distinction is quite general and
that languages seem to fill in the two classes in roughly the same way. Dutch,
for instance, mirrors the English data perfectly. To illustrate, (107) and (108)
show the A/B distinction in a contrast between comparative and superlative
quantifiers.
(107) Een driehoek heeft meer dan 1 zijde.
A triangle has more than 1 side.
(108) #Een driehoek heeft minstens 2 zijdes.
A triangle has at least 2 sides.
There are similar contrasts for other numeral modifiers. In a nutshell, the
Dutch data suggest the two classes in (109) and (110), which parallel the English ones.
27 I also counted the number of occurrences in a universal modal context. As would be
predicted, this yielded no significant difference between class A and class B modifiers. For
all modified numerals, this number was between 1 and 5.
28 The contrast between the Class A and Class B data is significant (χ² = 41.2, df = 1,
p = 1.375 × 10⁻¹⁰).
(109) Dutch Class A
(Positive:) meer dan — (more than), boven de — (above the)
(Negative:) minder dan — (fewer/less than), onder de — (under the)
(Neutral:) tussen de — en de — (between the. . . and. . . )
(110) Dutch Class B
(Positive:) ten minste —, minstens —, op z’n minst — (at least), vanaf
— (from off), zeker — (certain), minimaal — (minimal)
(Negative:) ten hoogste —, hoogstens —, op z’n hoogst — (at most), tot
— (up to), maximaal — (maximal)
(Neutral:) van — tot — (from — to —)
In other languages, we find similar data. The division between comparative
and superlative modifiers, in particular, appears to be cross-linguistically
quite general. In Italian, for instance, the following contrast exists.
(111) Un triangolo ha più di 1 lato.
A triangle has more than 1 side.
(112) #Un triangolo ha almeno 2 lati.
A triangle has at least 2 sides.
In Chinese, there also exists a superlative form that behaves like a class B
modifier.
(113) #Sanjiaoxing zui-shao you liang-tiao bian.
triangle most-little have 2-CL side
On the other hand, there also exists an alternative form resembling English
at least, which behaves differently. The form zhi-shao can be used in a way
similar to English at least in sentences like At least it doesn’t rain!
Despite this parallel to the English superlative modifiers, the example in (114)
appears to be fine, which suggests zhi-shao is of type A.
(114) Sanjiaoxing zhi-shao you liang-tiao bian.
triangles to-little have 2-CL side
I leave a more detailed investigation of such data for further research. What-
ever the outcome, however, the data first and foremost reveal that the type
of contrasts that have been the central focus of this paper occur in Chinese
and that, thereby, Chinese also appears to have the class A/B distinction.
Above, I suggested that prepositional numeral modifiers are to be divided
into two classes in accordance with the locative/directional distinction that
exists for their spatial meanings. The clearest case of a class B directional
prepositional modifier in English is up to. In many other languages, one
and the same particle is used for indicating spatial, numerical and temporal
extremes. (In English, up to cannot be used as a temporal operator, for
which until exists.) In Dutch, for instance, the preposition tot has these
three functions. Crucially, in all these three domains tot displays class B
characteristics.
(115) #Een driehoek heeft tot 10 zijdes.
A triangle has up to/until 10 sides.
(116) #Je auto stond tot hier geparkeerd.
Your car stood up to/until here parked.
‘#Your car was parked up to here’
(117) Je auto mag tot hier geparkeerd worden.
Your car may up to/until here parked be.
‘You may park your car up to here’
(118) #Jasper kwam tot middernacht de kamer binnengelopen.
J. came up to/until midnight the room inside-walked.
‘#J. entered the room until midnight’
(119) Jasper mag tot middernacht de kamer binnen komen lopen.
J. may up to/until midnight the room inside come walk.
‘J. is allowed to enter the room until midnight’
Similar data exist for German bis (zu), Hebrew ’ad, Catalan fins a, Spanish
hasta and Italian fino a. In fact, in Italian it appears that (120) is generally
awkward, resisting a reading that connects to speaker’s possibility. However,
it becomes acceptable if an overt modal verb is inserted.
(120) ??John ha invitato {al massimo / fino a} 50 amici.
John has invited {at most / until} 50 friends.
(121) John può invitare {al massimo / fino a} 50 amici.
John can invite {at most / until} 50 friends.
7 Conclusion
The central aim of this article has been to put forward the empirical ob-
servation that numeral modifiers come in two classes: those that relate to
definite amounts (class A) and those that resist association with definite
cardinality (class B). Theoretically, I proposed that underlying this distinction
is a difference in the kind of relations numeral modifiers encode: either a
simple comparison relation between numbers (class A) or a relation between
a range of values and its minimum or maximum (class B). I furthermore
showed how this theory can be implemented in a framework where numeral
modifiers are treated as degree quantifiers.
While there already existed analyses of both type A and type B modifiers,
the class difference that was the central focus of this article has not yet been
discussed. For the treatment of class A quantifiers in this article I adopted
the proposal of Hackl 2001. My account of class B modifiers, on the other
hand, is original. It can be compared to two closely related proposals on the
semantics of superlative modifiers: Geurts & Nouwen 2007, where superlative
modified numerals are proposed to lexically specify modal operators, and
Krifka 2007b, where superlative quantifiers are proposed to be speech act
modifiers. Neither work discusses the class A/B distinction, but I take it
that both these proposals, in view of the main observations of this article,
can be viewed as accounts not just of superlative quantifiers, but of class
B members in general. As suggested in section 5, my proposal is in certain
respects quite close to Krifka’s. It differs greatly, however, from Geurts &
Nouwen 2007 in the way the interaction between modified numerals and
modality is accounted for. In a way, the current article as well as Krifka
2007b represent a position where quantifiers lexically specify quite minimal
functions, which consequently leads to much of the work being done by
pragmatic mechanisms (such as blocking). For the proposal in Geurts &
Nouwen 2007, on the other hand, the balance is different in that a much
greater burden is placed on semantics. An in-depth comparison of these
accounts of class B quantifiers, however, is left for further research.
References
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Atlas, Jay David & Stephen C. Levinson. 1981. It-clefts, informativeness, and
logical form: Radical pragmatics (revised standard version). In Peter Cole
(ed.), Radical pragmatics, 1–61. New York: Academic Press.
Barwise, John & Robin Cooper. 1981. Generalized quantifiers and natural lan-
guage. Linguistics and Philosophy 4(2). 159–219. doi:10.1007/BF00350139.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language in-
terpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Breheny, Richard. 2008. A new look at the semantics and pragmatics of
numerically quantified noun phrases. Journal of Semantics 25(2). 93–140.
doi:10.1093/jos/ffm016.
Büring, Daniel. 2008. The least at least can do. In Charles B. Chang &
Hannah J. Haynie (eds.), Proceedings of WCCFL 26, 114–120. Somerville,
Massachusetts: Cascadilla Press.
Corblin, Francis. 2007. Existence, maximality and the semantics of numeral
modifiers. In Ileana Comorovski & Klaus von Heusinger (eds.), Existence:
Semantics and syntax (Studies in Linguistics and Philosophy 84), Springer.
Corver, Norbert & Joost Zwarts. 2006. Prepositional numerals. Lingua 116(6).
811–836. doi:10.1016/j.lingua.2005.03.008.
Davies, Mark. 2008. The corpus of contemporary American English (COCA):
385 million words, 1990-present. Available online at http://www.
americancorpus.org.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to
Harlem: Anankastic conditionals and related matters. Ms. MIT, available
on http://mit.edu/fintel/www/harlem-rutgers.pdf.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural
Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-4.
Geurts, Bart. 2006. Take five: the meaning and use of a number word.
In Svetlana Vogeleer & Liliane Tasmowski (eds.), Non-definiteness and
plurality, 311–329. Amsterdam/Philadelphia: Benjamins. Pre-published
version available at http://ncs.ruhosting.nl/bart/papers/five.pdf.
Geurts, Bart & Rick Nouwen. 2007. At least et al.: the semantics of scalar
modifiers. Language 83(3). 533–559.
Grice, Paul. 1975. Logic and conversation. In Peter Cole & Jerry L. Morgan
(eds.), Syntax and semantics 3: Speech acts, 41–58. New York: Academic
Press.
Hackl, Martin. 2001. Comparative quantifiers: Department of Linguistics
and Philosophy, Massachusetts Institute of Technology dissertation.
doi:1721.1/8765.
Heim, Irene. 1991. Artikel und Definitheit. In Arnim von Stechow & Dieter
Wunderlich (eds.), Semantik: Ein internationales Handbuch der zeitgenössischen
Forschung, Berlin: de Gruyter.
Heim, Irene. 2000. Degree operators and scope. In Proceedings of SALT 10,
Ithaca, NY: CLC Publications.
Horn, Laurence R. 1984. Toward a new taxonomy for pragmatic inference:
Q-based and R-based implicature. In Deborah Schiffrin (ed.), Meaning,
form and use in context, 11–42. Washington: Georgetown University Press.
Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian
Society 74. 57–74.
Kennedy, Christopher. 1997. Projecting the adjective: the syntax and semantics
of gradability and comparison: UCSD PhD. Thesis.
Kiparsky, Paul. 1973. "Elsewhere" in phonology. In Stephen R. Anderson &
Paul Kiparsky (eds.), A festschrift for Morris Halle, 93–106. New York: Holt,
Rinehart, & Winston.
Kiparsky, Paul. 1983. Word formation and the lexicon. In Proceedings of
the 1982 Mid-America Linguistics Conference, 47–78. Lawrence, Kansas:
University of Kansas.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken
Turner (ed.), The semantics/pragmatics interface from different points of
view vol. 1, 257–291. Elsevier.
Krifka, Manfred. 2007a. Approximate interpretation of number words: A case
for strategic communication. In Irene Vogel & Joost Zwarts (eds.), Cognitive
foundations of communication, Amsterdam: Koninklijke Nederlandse
Akademie van Wetenschappen.
Krifka, Manfred. 2007b. More on the difference between more than two and
at least three. Paper presented at University of California at Santa Cruz,
available at http://amor.rz.hu-berlin.de/~h2816i3x/Talks/SantaCruz2007.
pdf.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. University of
Chicago Press.
McCawley, James. 1978. Conversational implicature and the lexicon. In Peter
Cole (ed.), Syntax and semantics 9: Pragmatics, New York: Academic Press.
Nilsen, Øystein. 2007. At least: Free choice and lowest utility. Paper presented
at ESSLLI workshop on quantifier modification.
Nouwen, Rick. 2008a. Directionality in modified numerals: the case of up to.
Semantics and Linguistic Theory 18. doi:1813/13056.
Nouwen, Rick. 2008b. Upper-bounded no more: the implicatures of
negative comparison. Natural Language Semantics 16(4). 271–295.
doi:10.1007/s11050-008-9034-2.
Nouwen, Rick. 2009. Two kinds of modified numerals. In T. Solstad &
A. Riester (eds.), Proceedings of Sinn und Bedeutung 13, Available at http:
//www.let.uu.nl/~Rick.Nouwen/personal/papers/sub09.pdf, 15 pages.
Nouwen, Rick. 2010a. Two puzzles of requirement. In Maria Aloni & Katrin
Schulz (eds.), The Amsterdam Colloquium 2009, Springer. http://www.
hum.uu.nl/medewerkers/r.w.f.nouwen/papers/neccsuff.pdf.
Nouwen, Rick. 2010b. What’s in a quantifier? In Martin Everaert, Tom Lentz,
Hannah de Mulder, Øystein Nilsen & Arjen Zondervan (eds.), The linguistic
enterprise: From knowledge of language to knowledge in linguistics (Lin-
guistik Aktuell/Linguistics Today 150), John Benjamins. Pre-published
version available at http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/
papers/wiaq.pdf.
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics
and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Schwager, Magdalena. 2005. Exhaustive imperatives. In Paul Dekker & Michael
Franke (eds.), Proceedings of the 15th Amsterdam Colloquium, Universiteit
van Amsterdam.
Solt, Stephanie. 2007. Few more and many fewer: complex quantifiers based
on many and few. In Rick Nouwen & Jakub Dotlacil (eds.), Proceedings of
the ESSLLI2007 Workshop on Quantifier Modification.
Takahashi, Shoichi. 2006. More than two quantifiers. Natural Language
Semantics 14(1). 57–101. doi:10.1007/s11050-005-4534-9.
Umbach, Carla. 2006. Why do modified numerals resist a referential
interpretation? In Proceedings of SALT 15, 258–275. Cornell University
Press.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epis-
temic possibility. Natural Language Semantics 8(4). 255–290.
doi:10.1023/A:1011255819284.
Dr. R.W.F. Nouwen

Utrecht Institute for Linguistics OTS
Janskerkhof 13, NL-3512 BL
Utrecht, the Netherlands
R.W.F.Nouwen@uu.nl
doi: 10.3765/sp.3.4
Iffiness∗
Anthony S. Gillies
Rutgers University

Decision 2009-09-21 / Revised 2009-10-14 / Accepted 2009-11-18 / Final Version
Received 2010-01-17 / Published 2010-02-01
Abstract
How do ordinary indicative conditionals manage to convey conditional in-
formation, information about what might or must be if such-and-such is
or turns out to be the case? An old school thesis is that they do this by
expressing something iffy: ordinary indicatives express a two-place condi-
tional operator and that is how they convey conditional information. How
indicatives interact with epistemic modals seems to be an argument against
iffiness and for the new school thesis that if -clauses are merely devices for
restricting the domains of other operators. I will make the trouble both clear
and general, and then explore a way out for fans of iffiness.
Keywords: indicative conditionals, epistemic modality, if-clauses, conditionals,
strict conditionals, dynamic semantics
1 An iffy thesis
One thing language is good for is imparting plain and simple information:
there is an extra chair at our table or we are all out of beer. But — happily — we
∗ This paper has been around awhile, versions of it circulating since 05.2006 and accruing
a lot of debts of gratitude along the way. Chris Kennedy, Jim Joyce, Craige Roberts, Josef
Stern, Rich Thomason, audiences at the Rutgers Semantics Workshop (October 2007), the
Michigan L&P Workshop (Lite Version, November 2007), the Arché Contextualism & Relativism
Workshop (May 2008), the University of Chicago Semantics & Philosophy Language Workshop
(March 2009), and — especially (actually, especially∗ ) — Josh Dever, David Beaver, Kai von
Fintel, Brian Weatherson, and the anonymous S&P referees have all done their best trying
to save me from making too many howlers. But too many is surely context dependent, so
caveat emptor. This research was supported in part by the National Science Foundation
under Grant No. BCS-0547814.
©2010 A. S. Gillies
do not only exchange plain information about tables, chairs, and beer mugs.
We also exchange conditional information thereof: if we are all out of beer, it
is time for you to buy another round. That is very useful indeed.
Conditional information is information about what might or must be, if
such-and-such is or turns out to be the case. My target here has to do with
how such conditional information manages to get expressed by indicative
conditionals (not so called because anyone thinks that’s a great name but
because no one can do any better). Some examples:
(1) a. If the goat is behind door #1, then the new car is behind door #2.
b. If the No. 9 shirt regains his form, then Barça might advance.
c. If Carl is at the party, then Lenny must also be at the party.
Each of these is an ordinary indicative, two of them have epistemic modals in
the consequent clause, and all of them express a bit of ordinary conditional
information.1 What I am interested in is how well the indicatives play with
the epistemic modals.
What these examples say is plain. Take (1b). This says that — within
the set of possibilities compatible with the information at hand — among
those in which the star striker regains his form, some are possibilities in
which Barça advance. Or take (1c). It says something about the occurrence
of Lenny-is-at-the-party possibilities within the set of Carl-is-at-the-party
possibilities — that, given the information at hand, every possibility of the
latter stripe is also of the former stripe. So what sentences like these say is
plain. How they say it isn’t. That’s my target here: How is it that the if s in
our examples manage to express conditional information and do so in a way
compatible with how they play with epistemic modals?
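The glosses of (1b) and (1c) just given can be pictured with finite sets. This is only an illustrative sketch: the four toy worlds, the name `live` for the set of possibilities compatible with the information at hand, and the helper names are all assumptions made for the example. Nothing here decides how the if contributes to these truth conditions; it only restates what the sentences are said to say.

```python
# Toy worlds are labelled by which facts hold in them (hypothetical data).
# For (1c): "carl" = Carl is at the party, "lenny" = Lenny is at the party.
worlds = {
    "w1": {"carl", "lenny"},
    "w2": {"lenny"},
    "w3": set(),
    "w4": {"carl"},
}


def where(fact, among):
    """Worlds in `among` at which `fact` holds."""
    return {w for w in among if fact in worlds[w]}


def some_antecedent_worlds_are(consequent, antecedent, live):
    """'might' gloss, as for (1b): some live antecedent world is a consequent world."""
    return bool(where(antecedent, live) & where(consequent, live))


def all_antecedent_worlds_are(consequent, antecedent, live):
    """'must' gloss, as for (1c): every live antecedent world is a consequent world."""
    return where(antecedent, live) <= where(consequent, live)


live = {"w1", "w2", "w3"}  # w4 is ruled out by the information at hand
must_reading = all_antecedent_worlds_are("lenny", "carl", live)
might_reading = some_antecedent_worlds_are("lenny", "carl", live)

# Once w4 becomes a live possibility, the 'must' gloss fails
# while the 'might' gloss survives.
must_with_w4 = all_antecedent_worlds_are("lenny", "carl", live | {"w4"})
might_with_w4 = some_antecedent_worlds_are("lenny", "carl", live | {"w4"})
print(must_reading, might_reading, must_with_w4, might_with_w4)
```

The point of the toy case is just that both glosses are claims about the antecedent worlds inside the live set, which is the information the conditionals are supposed to convey.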
The simplest story about how the if s in our examples manage to express
conditional information is that each of them expresses the information of
a conditional. Which is to say: what these conditional sentences mean can
be read-off the fact that if expresses a conditional operator. Let’s say that
a story about if is iffy iff it takes if to express a bona fide operator, a bona
fide iffy operator (that is, a conditional operator properly so called), and the
same bona fide iffy operator in each of the sentences in (1). We will have to
sharpen that up by saying what it means for an operator to be a conditional
1 We ought to be careful to distinguish between conditional sentences (sentences of natural
language), conditional connectives (two-place sentential connectives in some regimented
language that may serve to represent the logical forms of conditional sentences), and
conditional operators (relations that may serve as the denotations of conditional connectives).
operator properly so called. But that is the gist: iffiness — a.k.a. the operator
view — is the thesis that ordinary indicative conditionals manage to express
conditional information because if expresses a conditional operator.
Depending on your upbringing, the operator view of if may well seem
either obvious or obviously wrongheaded. More on that below. Either way,
it is a hard line to maintain: how conditional sentences play with epistemic
modals seems to refute it. A seeming refutation isn’t quite the same as an
actual one, though. I will show that the refutation isn’t quite right by showing
how fans of iffiness can account for what needs accounting for. But before
showing how the operator view can be made to account for how if s and
modals interact I want to make it look for all the world like it can’t be done.
2 Doom and how to avoid it (sketches thereof)
The operator view is an old school story about indicatives. It says that if
expresses some relation between the (semantic value of the) antecedent and
consequent. So if takes its place alongside other connectives and expresses
an operator — the same operator — on the semantic values of the sentences it
takes as arguments.2 To tell a story like this we have to say exactly what that
operator is. But not just any telling will do. I want to show how our simple
examples cause what looks like insurmountable trouble (doom, even) for any
version of the operator view. Here’s an informal sketch of the trouble, what
rides on it, and how — eventually — we can and ought to get out of the mess.
Take this sketch as a promissory note that a formally precise version of all
that can be given; the rest of the paper makes good on that.
Suppose if expresses the limit case conditional operator of material
implication. Iffiness requires that in sentences like (1b) and (1c) either the
epistemic modals outscope the conditionals or the conditionals outscope
the modals. Neither choice gets the truth conditions right if the conditional
operator is the horseshoe. That’s easy to see (and well known).3 Linguists
grow up on arguments like that. That is one reason why even though the
operator view is the first thing a logician thinks of, it is the last thing a
linguist does.
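The well-known problem can be made concrete for (1b). The sketch below is an illustration under stated assumptions rather than the argument of the text: it treats might as existential quantification over a fixed set of live worlds, models the horseshoe as material implication on truth values, and uses a made-up two-world scenario in which no live world has the striker regaining his form and Barça advancing. Intuitively (1b) should then be false, yet both scope choices come out true.

```python
# Each toy world records (regains_form, barca_advance); the data are hypothetical.
worlds = {
    "u": (False, True),   # no return to form, Barça advance anyway
    "v": (True, False),   # return to form, Barça go out
}
live = {"u", "v"}         # possibilities compatible with the information at hand


def p(w):
    return worlds[w][0]


def q(w):
    return worlds[w][1]


def horseshoe(a, b):
    """Material implication on truth values."""
    return (not a) or b


def might(prop):
    """Existential epistemic modal over the live worlds (a simplifying assumption)."""
    return any(prop(w) for w in live)


def narrow_scope(i):
    """p ⊃ might q, evaluated at world i."""
    return horseshoe(p(i), might(q))


def wide_scope():
    """might (p ⊃ q)."""
    return might(lambda w: horseshoe(p(w), q(w)))


# The intuitive gloss of (1b): some live p-world is a q-world.
intuitive = any(p(w) and q(w) for w in live)

print(intuitive, narrow_scope("u"), narrow_scope("v"), wide_scope())
```

In this scenario the narrow-scope reading is true at both worlds (vacuously at u, and at v because some live world or other is a q-world), and the wide-scope reading is true merely because a live world falsifies the antecedent.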
2 If is a little word with a big history — a big history that we can’t adequately tour here. But
there are guides for hire: for instance, Bennett (2003) and von Fintel (2009).
3 The material conditional analysis of ordinary indicatives is defended (in somewhat different
ways) by, for example, Grice (1989), Jackson (1987), and Lewis (1976). A textbook version of
this “no-scope” argument that has the horseshoe analysis as its target appears in von Fintel
& Heim 2007.
But (as I’ll show) this very same trouble holds no matter what conditional
operator an iffy story says if expresses. To see that requires two things. First,
we need to say in a precise way what counts as a conditional operator (Section
4). Given some pretty weak assumptions iffiness requires that if means all
(well, all relevant). Second, there are some characteristic Facts about how
indicatives and epistemic modals interact (Section 5). These neatly divide:
there are some consistency facts and there are some intuitive entailment
facts. The operator view requires that either the conditionals outscope the
modals or the modals outscope the conditionals. Something general then
follows: no matter what conditional operator we say if expresses, one scope
choice is ruled out by the consistency facts, the other by the entailments
(Section 6).
That seems to be bad news for any fan of any version of the old school
operator view. And there seems to be more bad news in the offing since
the operator view isn’t the only game in town (in some circles, it’s a game
played only on the outskirts of town). The anti-iffiness rival — a.k.a. the
restrictor view — is a new school approach. It embraces Kratzer’s thesis that
if is not a connective at all: it doesn’t express an operator, a fortiori not
an iffy operator, and a fortiori not the same iffy operator in each of our
example sentences it figures in.4 Instead, says the restrictor analysis, if
simply restricts other operators. In the cases we will care about, it restricts
(possibly covert) epistemic modals. The restrictor view makes embarrassingly
quick work of the data that spells such trouble for the operator view (Section
7).
But the success of the restrictor analysis is no argument against Chuck
Taylors and skyhooks tout court. That’s because there are old school stories
that say that if expresses a strict conditional operator over possibilities
compatible with the context, and that it can do all the restricting that needs
doing (Section 8). Once we see just how, we can look back and see more
4 The restrictor view gets its inspiration from Lewis’s (1975) argument that certain if s (under
adverbs of quantification) cannot be understood as expressing some conditional but rather
serve to mark an argument place in a polyadic construction. Kratzer’s thesis is that this holds
for if across the board. The classic references are Kratzer 1981, 1986. There is another rival,
too: some take if to be an operator, but an operator that does not (when given arguments)
express a proposition (Adams 1975; Gibbard 1981; Edgington 1995, 2008). Instead, they say,
if s express but do not report conditional beliefs on the part of their speakers. I will ignore
this view here: it doesn’t really start off as the most plausible candidate, the trouble I make
here about how if s and modals interact makes it less plausible not more, and it will just take
us too far afield.
clearly what is at stake in the difference between new school and old, why
iffiness is worth pursuing (Section 9), and how this version of the old school
story relates to recent dynamic semantic treatments (Section 10).
3 Ground rules
Let’s simplify. Assume that meanings get associated with sentences by getting
associated with formulas in an intermediate language that represents the
relevant logical forms (lfs) of them. Thus a story, old school or otherwise,
has to first say what the relevant lfs are and then assign those lfs semantic
values.
We will begin with an intermediate language L that has a conditional
connective that will serve to represent the lfs of ordinary indicatives. So let
L be generated from a stock of atomic sentence letters, negation (¬), and
conjunction (∧) in the usual way. But L also has the connective (if ·)(·),
and the modals must and might. What I have to say can be said about
an intermediate language that allows that the modals mix freely with the
formulas of the non-modal fragment of L but restricts (if ·)(·) so that it
takes only non-modal sentences in its first argument. So assume that L is
such an intermediate language. When these restrictions outlive their utility,
we can exchange them for others.5
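The shape of L can be sketched as a small datatype. This is a hypothetical encoding for illustration only: the class names and the helper `is_modal_free` are not from the text, and the only restriction enforced is the one stated above, namely that (if ·)(·) takes a non-modal sentence as its first argument. Treating an embedded conditional in antecedent position as excluded too is an additional simplifying assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Must:
    body: "Formula"


@dataclass(frozen=True)
class Might:
    body: "Formula"


@dataclass(frozen=True)
class If:
    antecedent: "Formula"
    consequent: "Formula"

    def __post_init__(self):
        # Restriction on L: the first argument of (if .)(.) is non-modal.
        if not is_modal_free(self.antecedent):
            raise ValueError("antecedent of (if .)(.) must be non-modal")


Formula = Union[Atom, Not, And, Must, Might, If]


def is_modal_free(formula):
    """True iff the formula is built from atoms, negation and conjunction only."""
    if isinstance(formula, Atom):
        return True
    if isinstance(formula, Not):
        return is_modal_free(formula.body)
    if isinstance(formula, And):
        return is_modal_free(formula.left) and is_modal_free(formula.right)
    # Assumption of this sketch: must, might and if are all excluded here.
    return False


# (1c) as a formula of L: (if carl)(must lenny).
carl = Atom("Carl is at the party")
lenny = Atom("Lenny is at the party")
example_1c = If(carl, Must(lenny))
```

On this encoding a modal may sit in the consequent, as in (1c), whereas something like `If(Might(carl), lenny)` is rejected when it is constructed.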
Iffiness requires that the if of English expresses something properly iffy.
That leaves open just which conditional operator we say that the if of English
means. But our choices here are not completely free, and some ground rules
will impose some order on what we may say. These will constrain our choice
by saying what must be true for a conditional operator to be rightfully so
called. But before getting to that, I’ll start with what I will assume about
contexts.
First, a general constraint: assume that truth-values — for the if s and
the modals (when we come to that), as well as for the boolean fragment of
L — are assigned at an index (world) i with respect to a context. I will assume
that W , the space of possible worlds, is finite. Nothing important turns on
this, and it simplifies things.
For the fragment of L with no modals and no if s, contexts are idle. It will
be the job of the modals to quantify over sets of live possibilities and the job
5 Conventions: p, q, r , . . . range over sentences of L (subject to our constraints on L); i, j, k, . . .
range over worlds; and P , Q, R, . . . range over sets of worlds. And let’s not fuss over whether
what is at stake is the ‘if ’ of English or the ‘if ’ of L; context will disambiguate.
of contexts to select these sets of worlds over which the modals do their job.
What I want to say can be said in a way that is agnostic about just what kinds
of things contexts are: all I insist is that, given a world, they determine a set
of possibilities that modals at that world quantify over.6 The functions doing
the determining need to be well-behaved.
Given a context c — replete with whatever things contexts are replete
with — an epistemic modal base C determined by it is just what we need:
Definition 3.1 (modal bases). Given a context c, C is a modal base (for c)
only if:
C = λi.{j : j is compatible with the c-relevant information at i}
Since the only context dependence at stake here will be dependence on
such bases, we can get by just as well by taking them to go proxy for bona
fide contexts, granting them the honorific “contexts”, and relativizing the
assignment of truth-values to index–modal base pairs directly. So we’ll be
saying just which function ⟦·⟧^{C,i} : L → {0, 1} is, where C represents the
relevant contextual information. No harm comes from that, and it makes for
a prettier view.7
But not just any function from indices to sets of indices will do as a
(proxy) context. So we constrain C’s accordingly, requiring that they are
well-behaved — that is, reflexive and euclidean:
6 The problems and prospects for iffiness are independent of just whose information in a
context — speaker, speaker plus hearer, just the hearer, just the hearer’s picture of what the
speaker intends, and so on — counts for selecting the domains for the modals to do their job,
and whether or not that information is information-at-a-context at all. So let’s keep things
simple here. If you’d rather be reading a paper which has these (and other) complexities at
the forefront, see von Fintel & Gillies 2007, 2008a,b and the references therein.
7 Three comments. First: take ⟦·⟧^C to be shorthand for {i : ⟦·⟧^{C,i} = 1}. If p’s denotation
is invariant across contexts (if ⟦p⟧^C = ⟦p⟧^{C′} no matter the choice for C and C′) let’s
agree to conserve a bit of (virtual) ink and sometimes omit the superscript: so, e.g., the
if s I am focusing on here have non-modal antecedents, and so those antecedents will be
context-invariant. Second: it’s a little misleading to say that the only context dependence
is dependence on modal bases since we will want to allow the possibility that what worlds
are relevant to an if at a world can vary across contexts. But, in fact, we can (and will)
still leave room for that possibility by constraining how contexts and the sets of if -relevant
possibilities relate. Third: if I had different ambitions, we couldn’t simplify quite like this. If
the interaction at center stage were how if s and quantifiers interact, or if the modals in the
if /modal interaction were deontic, then we’d want our contexts to rightly characterize the
kind of information at stake and taking them to determine sets of possibilities compatible
with what is known would not do. But my ambitions here aren’t different from what they
are.
Definition 3.2 (well-behavedness). C is well-behaved iff:
i. i ∈ Ci (reflexiveness)
ii. if j ∈ Ci then Ci ⊆ Cj (euclideanness)
C represents a (proper) context only if it is well-behaved.
Observation 3.1. If C is well-behaved then Ci is closed: well-behavedness
implies that if j ∈ Ci, then Cj = Ci.
Proof. Suppose j ∈ Ci. Consider any k ∈ Cj. Since C is euclidean and j ∈ Ci,
Ci ⊆ Cj. Since C is reflexive, i ∈ Ci and thus i ∈ Cj. Appeal to euclideanness
again: since k ∈ Cj, Cj ⊆ Ck; but i ∈ Cj and so i ∈ Ck. And once more: since
i ∈ Ck, Ck ⊆ Ci. And now reflexiveness: k ∈ Ck and so k ∈ Ci. Hence Cj ⊆ Ci. (The
inclusion in the other direction just is euclideanness.)
Gloss Ci as the set of live possibilities at i in C. That Ci is closed means
that the live possibilities in Ci do not vary across worlds compatible with C.8
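Because W is assumed finite, Observation 3.1 can also be checked by brute force on a small case. The sketch below is a sanity check, not a substitute for the proof: it assumes a three-world W, represents a modal base as a dictionary from worlds to sets of worlds, and uses illustrative function names throughout.

```python
from itertools import combinations, product

W = (0, 1, 2)  # a small finite space of worlds (assumed for the check)


def subsets(worlds):
    """All subsets of `worlds`, as frozensets."""
    return [frozenset(c) for r in range(len(worlds) + 1)
            for c in combinations(worlds, r)]


def is_reflexive(C):
    # i is in C_i for every world i
    return all(i in C[i] for i in W)


def is_euclidean(C):
    # if j is in C_i then C_i is a subset of C_j
    return all(C[i] <= C[j] for i in W for j in C[i])


def is_well_behaved(C):
    return is_reflexive(C) and is_euclidean(C)


def is_closed(C):
    # if j is in C_i then C_j equals C_i
    return all(C[j] == C[i] for i in W for j in C[i])


# Every function from worlds to sets of worlds over W.
all_bases = [dict(zip(W, choice))
             for choice in product(subsets(W), repeat=len(W))]
well_behaved = [C for C in all_bases if is_well_behaved(C)]
counterexamples = [C for C in well_behaved if not is_closed(C)]
print(len(all_bases), len(well_behaved), len(counterexamples))
```

Over three worlds there are 512 candidate functions; the well-behaved ones are exactly the five that partition W into cells, and none of them fails to be closed.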
4 Conditional operators
By saying something about what must be true of an operator for it to be
a conditional operator properly so called we thereby say something about
what must be true for a story to be iffy. Taking if to express a bona fide
conditional operator requires, minimally, two things.
Thing one: it requires, in the cases we’ll care about, that if such-and-
such, then thus-and-so doesn’t take a stand on whether such-and-such is
the case and so conditionals like that are typically happiest being uttered
in circumstances in which such-and-such is compatible with the context as
it stands when the conditional is issued. I will take this as a definedness
condition on the semantics for our conditional connective.
Definition 4.1 (definedness). ⟦(if p)(q)⟧^{C,i} is defined only if p is compatible
with Ci.
This is a weak constraint.9
8 Given euclideanness, we could get by with different assumptions on C to the same effect.
But reflexiveness is a constraint it makes sense to want since, when we come to them,
epistemic modals — what might or must be in virtue of what is known — in a given context
will quantify over the set of possibilities compatible with that context.
9 The motivating idea isn’t novel (see, e.g., Stalnaker 1975): if it’s ruled out that p in C,
and you want to say something conditional on p in C, then you should be reaching for a
Thing two: it requires that if expresses a relation between antecedent

and consequent. Whether if such-and-such, then thus-and-so is true depends
on whether the relevant worlds at which such-and-such is true bear the
right relationship to the worlds where thus-and-so is true. Take an arbitrary
conditional like (if p)(q) at i, in C. And let P and Q be the sets of antecedent
and consequent possibilities so related by the if . Now we need to zoom in on
the relevant worlds in P . So let Di be the set of if -relevant worlds at i. For if
to express a conditional operator properly so called, its denotation must be a
relation R between P -together-with-the-relevant-possibilities-Di and Q.
Di is the set of possibilities relevant for the if at i. Since Di is a function
of i, different worlds may be relevant for one and the same if when evaluated
at different worlds. But, depending on your favorite theory, Di may be a
function of more than just i: it may be a function of i, of C, of p, of q,
or of your kitchen sink. We will return to that shortly. No matter your
favorite theory, we can still ex ante agree to this much: i is always among
the possibilities relevant for an if at i, and only possibilities compatible with
the context are relevant for an if at i. That is: Di is the set of if -relevant
worlds at i only if i ∈ Di and Di ⊆ Ci . The first requirement is a platitude:
the facts at a world are always relevant to whether an indicative at that world
is true. The second means that an indicative in a context is supposed to say
something about the possibilities compatible with that context.
Beyond this, what your favorite theory implementing the operator view
says about Di may vary because what stories say counts as an if -relevant
possibility varies. But what does not vary is that all such stories determine
Di in a pretty straightforward way and so the denotation they assign to if
can be put as a relation between the relevant antecedent possibilities and the
consequent possibilities. Three examples:
Example 1 (variably strict conditional). Suppose your favorite story
takes if to be a variably strict conditional based on some underlying ordering
of possibilities (Stalnaker 1968; Lewis 1973). For every world i, let ≤i be
an ordering of worlds, a relation of comparative similarity (at least) weakly
centered on i. Given a conditional (if p)(q) at i in C, you will want to
identify Di with the set of possibilities no more dissimilar than the most
similar p-world to i, restricted by Ci .
Example 2 (strict conditional). Suppose your favorite Lewis-inspired story
comes not from D.K. but from C.I. You thus take if to be strict implication
(restricted to C). But that, too, can be put in terms of orderings: your ordering
≤i is universal, treating all worlds the same. Whence it follows that, since
the nearest p-world is the same distance from i as is every world, taking
Di to be the set of possibilities no further from i than the nearest p-world
amounts to taking Di to be the set of all worlds W , restricted by Ci .
9 (continued) counterfactual not an indicative. That can be implemented in any number of ways, including
making it a presupposition of if -clauses (see, e.g., von Fintel 1998a).
Example 3 (material conditional). Suppose you are smitten by truth-tables,
and your favorite incarnation of the operator view is the material conditional
story. Equivalently: you will have a maximally discerning ordering (every
world an island) and take Di to be the set of closest worlds to i simpliciter
according to that ordering. For an if at i you will thus take Di to be {i}.
(For an if at some other world j, even an if with the same antecedent and
consequent as the one at i, take Dj to be {j}.)
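The three examples can be put side by side in a small sketch. Everything here is an illustrative assumption rather than part of the text: worlds are integers, comparative similarity is modeled by a numeric distance, and the helper names are hypothetical.

```python
# Toy model (assumed for illustration): worlds are integers and
# dissimilarity to i is measured by a distance function dist(i, w).

def within_nearest_p(i, W, C_i, P, dist):
    # worlds no more dissimilar to i than the most similar p-world,
    # restricted by C_i
    cutoff = min(dist(i, w) for w in P)
    return {w for w in W if dist(i, w) <= cutoff} & C_i

def closest_simpliciter(i, W, dist):
    # the worlds closest to i, with no reference to the antecedent
    best = min(dist(i, w) for w in W)
    return {w for w in W if dist(i, w) == best}

def centered(i, w):
    # an ordering centered on i; each world sits at its own distance
    return abs(i - w)

def universal(i, w):
    # the universal ordering: all worlds treated the same
    return 0

W = {0, 1, 2, 3}
C_i = {0, 1, 2, 3}
P = {2}
i = 0

D_variably_strict = within_nearest_p(i, W, C_i, P, centered)
D_strict = within_nearest_p(i, W, C_i, P, universal)
D_material = closest_simpliciter(i, W, centered)

# Each choice respects the two ground rules: i ∈ D_i and D_i ⊆ C_i.
for D in (D_variably_strict, D_strict, D_material):
    assert i in D and D <= C_i

print(D_variably_strict, D_strict, D_material)
```

On this toy model the variably strict domain stops at the nearest p-world, the strict domain is all of Ci, and the material domain is just {i}.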
Summing this all up: even before taking a stand on just what relation
between relevant antecedent possibilities and consequent possibilities that if
must express in order to express a conditional operator properly so called,
we know that it must still express such a relation. So let’s insist that we
can put things that way, parametric on just how Di gets picked out and
so parametric on what counts as “relevant” antecedent possibilities and so
parametric on the details of your favorite theory:
Definition 4.2 (relationality). (if ·)(·) expresses a conditional only if its
truth conditions can be put this way:
if defined, ⟦(if p)(q)⟧^{C,i} = 1 iff R(Di ∩ P , Q)
for some set of possibilities Di and relation R, where i ∈ Di and Di ⊆ Ci .
But not just any relation between Di ∩ P and Q counts as a conditional

relation properly so called. I insist on three minimal constraints on R, for any
P and Q: (i) that Di ∩ P imposes some order on the set of Q’s so related; (ii)
that Q matters to whether the relation holds; and (iii) that, plus or minus
just a bit, only the relationship between the possibilities in Di ∩ P and
the possibilities in Q matters to whether the relation holds. These are not
controversial, but do bear some unpacking.10
First, the order imposed by the antecedent:
10 This general way of characterizing conditionality is not new: both the assumptions and
the results here are inspired by van Benthem’s (1986: §4) investigation of conditionals as
generalized quantifiers. There are, however, differences between his versions and mine.
Definition 4.3 (order). R is orderly iff:

i. R(Di ∩ P , P )
ii. R(Di ∩ P , Q) and Q ⊆ S imply R(Di ∩ P , S)
iii. R(Di ∩ P , Q) and R(Di ∩ P , S) imply R(Di ∩ P , Q ∩ S)
R is something (if ·)(·) at i could mean only if it is orderly.
Such R’s are precisely those for which the set of Q’s a Di ∩ P bears it to
forms a filter that contains P .11 That is an aesthetic reason for constraining
R this way. Such R’s also jointly characterize the basic conditional logic.12
The relational properties correspond to reflexivity, right upward monotonic-
ity, and conjunction. That is another — only partly aesthetic — reason for
constraining them this way.
Second, R must care about consequents. This is just the requirement that
conditional relations, like quantifiers, be active:
Definition 4.4 (activity). R is active iff:
if Di ∩ P ≠ ∅ then there is a Q and Q′ such that: R(Di ∩ P , Q) but not
R(Di ∩ P , Q′ )
R is something (if ·)(·) at i could mean only if it is active.
This means that R cares about how Di ∩ P relates to Q. So long as there

are some relevant P -possibilities, there have to be some Q’s for which the
relation holds and some for which it doesn’t.
And finally: R is a relation between the sets of possibilities. Thus if R
holds at all between P -plus-the-relevant-possibilities-Di and the consequent-
possibilities Q, R will hold between any two sets of things that play the right
possibility role. Intrinsic properties of worlds don’t count for or against the
relation holding. The idea is simple, the execution harder. That is because
I have allowed you to choose your favorite iffy theory, and what goes into
determining Di depends on your choice.
What is important is this: suppose your favorite story posits some ad-
ditional structure to modal space to find just the right worlds which, when
combined with P , gives the set of worlds relevant for evaluating Q. That
means that your favorite story cares about how P relates to Q but also about
the distribution of the worlds in P compared to the distribution in Q: for
example, perhaps insisting that it is the closest worlds in P to i that must bear
R to Q. If we systematically swap possibilities for possibilities in a way that
preserves the relevant structure, then the conditional relation ought to hold
pre-swapping iff it holds post-swapping. And mutatis mutandis for Di : once
the posited structure does its job determining Di , any systematic
swapping of possibilities that leaves the domain untouched should also leave
the conditional relation untouched.13
11 It follows straightaway that orderly R’s are fully reflexive in the sense that R(Di ∩ P , Di ∩ P ).
12 See Veltman 1985 for a proof.
Where π is such a mapping and P a set of worlds, let π (P ) be the set of
worlds i such that π (j) = i for some j ∈ P . Then:
Definition 4.5 (quality). R is qualitative iff:
R(Di ∩ P , Q) implies R(π (Di ∩ P ), π (Q))
R is something (if ·)(·) at i could mean only if it is qualitative.
This does generalize the familiar constraint on quantifiers — it allows condi-

tional operators to care about both the relationship between P and Q and
also where the satisfying worlds are. If ≤i is the universal ordering then this
requirement reduces to the more familiar quantitative one (restricted to Ci ).
And if Di = {i}, it trivializes.
I am insisting that a story is iffy only if the truth conditions for an indicative
(if p)(q) at i in Ci can be put as a relation R between Di ∩ P and
Q. And we have insisted that the relation be constrained in sensible ways — it
must impose some order on sets of consequent possibilities, it must care
about consequents, and it must not care about the intrinsic properties of pos-
sibilities. Each example of an instance of the operator view above — variably
strict, strict, and material conditionals — lives up to these constraints. Still, it
seems like for all we have said it is possible to take the conditional to be true
just in case most/many/several/some/just the right possibilities in Di ∩ P
are in Q. But that is not so: given our constraints, if must mean all.14
13 This is the natural extension of the familiar requirement that quantifiers be quantitative:
for Q to be a quantifier (with domain E) it must be that QE (A, B) iff QE (f (A), f (B)) where
f is an isomorphism of E. Once we have structure to our domain, this will not do. The
more general constraint is then to require that Q be invariant under O-automorphisms of
the domain, where O is the ordering that imposes the posited structure. We can get by with
slightly less: namely, stability under Di -invariant automorphisms.
14 Well, all relevant. This was first proved by van Benthem — see, e.g., van Benthem 1986. The
version I give is simpler (we’re ignoring the infinite case) and a bit more general (slightly
weaker assumptions); the proof is based on one in Veltman 1985, but generalizes it slightly.
Observation 4.1. Assume R is a conditional relation properly so called. Then

R(Di ∩ P , Q) iff Di ∩ P ⊆ Q.
Proof. I care about the left-to-right direction.

Suppose, for reductio, that R(Di ∩ P , Q) but Di ∩ P ⊈ Q. What we’ll
see is: (i) R(Di ∩ P , P ∩ Q); (ii) the world that witnesses that Di ∩ P ⊈ Q can
be exploited (by quality) to show that no world in P ∩ Q plays a role in
R(Di ∩ P , P ∩ Q) holding, from which it follows that R(Di ∩ P , ∅); (iii) from
which it follows that Di ∩ P must be empty, a contradiction.
(i): By hypothesis R(Di ∩ P , Q). By order it follows that R(Di ∩ P , P ) and
hence that R(Di ∩ P , P ∩ Q).
(iia): Claim: Di ∩ P ∩ Q ≠ ∅. Proof of Claim: Assume otherwise. order
guarantees that R(Di ∩ P , Di ∩ P ). By hypothesis R(Di ∩ P , Q), and so by
order R(Di ∩ P , Di ∩ P ∩ Q). Applying the assumption that Di ∩ P ∩ Q = ∅:
R(Di ∩ P , ∅). Appeal to order again and we have that R(Di ∩ P , S) for any
S. But then Di ∩ P must be empty (activity), contradicting the assumption
that Di ∩ P ⊈ Q and proving the Claim.
(iib): Let j be a witness to Di ∩ P ⊈ Q. So j ∈ Di ∩ P but j ∉ Q. Now pick
any confirming instance k (that is, any k ∈ Di ∩ P ∩ Q) and let π be the
mapping that swaps k and j and leaves all else untouched:
• π (j) = k
• π (k) = j
• π (i) = i for every i ∉ {j, k}
By (i) R(Di ∩ P , P ∩ Q). Hence, by quality, R(π (Di ∩ P ), π (P ∩ Q)). But π
doesn’t affect Di ∩ P . So: R(Di ∩ P , π (P ∩ Q)). That is: R holds between
Di ∩ P and both P ∩ Q and π (P ∩ Q). Hence, by order, it holds also
between Di ∩ P and their intersection: R(Di ∩ P , (P ∩ Q) ∩ π (P ∩ Q)). But
π (P ∩ Q) = ((P ∩ Q) \ {k}) ∪ {j}, so their intersection is (P ∩ Q) \ {k}. So:
R(Di ∩ P , (P ∩ Q) \ {k}). Which is to say that k is irrelevant for R’s holding. But
k was any world in Di ∩ P ∩ Q, so finiteness plus order implies R(Di ∩ P , ∅).
(iii): Appeal to order again: since R(Di ∩ P , ∅), it holds that for any S
whatever R(Di ∩ P , S). Whence, by activity, it follows that Di ∩ P = ∅. And
that contradicts the assumption that Di ∩ P ⊈ Q.
The intuitive version is just this: if R holds between Di ∩ P and Q then the
former must be included in the latter. That is because if things didn’t go that
way then the witnessing counterexample world could play the role of any
one of the confirming worlds. But that would mean that confirming worlds
play no role. Nothing like that could be something a conditional properly so

called could mean. So Di ∩ P must be included in Q after all.
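For the finite case, Observation 4.1 can also be confirmed by brute force. The sketch below makes several simplifying assumptions that are mine and not the text's: a three-world space, a fixed set X standing in for Di ∩ P, candidate relations represented as families of consequent sets, and quality checked only for permutations that leave X untouched (the kind of swap the proof uses). All names are hypothetical.

```python
from itertools import combinations, permutations

W = (0, 1, 2)
SUBSETS = [frozenset(c) for r in range(len(W) + 1) for c in combinations(W, r)]
PERMS = [dict(zip(W, img)) for img in permutations(W)]

def image(pi, s):
    # the set of worlds that pi maps the members of s onto
    return frozenset(pi[w] for w in s)

def families():
    # every family of subsets of W: a candidate for {Q : R(X, Q)}
    for mask in range(2 ** len(SUBSETS)):
        yield {s for k, s in enumerate(SUBSETS) if (mask >> k) & 1}

def orderly(X, F):
    # reflexivity, right upward monotonicity, and conjunction
    return (X in F
            and all(S in F for Q in F for S in SUBSETS if Q <= S)
            and all(Q & S in F for Q in F for S in F))

def active(X, F):
    # if X is non-empty, R holds of some consequents and fails of others
    return not X or 0 < len(F) < len(SUBSETS)

def qualitative(X, F):
    # invariance under swaps of worlds that leave X untouched
    return all(image(pi, Q) in F
               for pi in PERMS if image(pi, X) == X
               for Q in F)

def survivors(X):
    return [F for F in families()
            if orderly(X, F) and active(X, F) and qualitative(X, F)]

for X in SUBSETS:
    inclusion = {Q for Q in SUBSETS if X <= Q}
    # the only relation left standing is inclusion: R(X, Q) iff X ⊆ Q
    assert survivors(X) == [inclusion]

print("checked", len(SUBSETS), "choices of X")
```

For each choice of X the search runs over all 256 families of consequents, and exactly one survives the three constraints.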
5 Three facts
Iffiness requires that if is a conditional connective that expresses a con-

ditional operator, and that pretty much means that if has to mean all. It
requires that no matter what other operators we might find in its neighbor-
hood. That spells trouble because of three simple Facts about how indicative
conditionals and epistemic modals play together.15
I have lost my marbles. I know that just one of them — Red or Yellow — is
in the box. But I don’t know which. I find myself saying things like:
(2) Red might be in the box and Yellow might be in the box.
So, if Yellow isn’t in the box, then Red must be.
And if Red isn’t in the box, then Yellow must be.
Conjunctions of epistemic modals like Red might be in the box and Yellow
might be in the box are especially useful when the bare prejacents partition
the possibilities compatible with the context. The first fact is simply that if s
are consistent with such conjunctions of modals.
Fact 1 (consistency). Suppose S1 and S2 partition the possibilities compati-

ble with the context. Then the following are consistent:
i. might S1 and might S2
ii. if not S1 , then must S2 ; and if not S2 , then must S1
15 Three notes about the Facts. First: “Facts” may be laying it on a little thick. The judgments
are robust, and the costs high for denying the generalizations as I put them. That’s all true
even if what we may say about them is a matter for disputing. But it does not much matter:
what I really care about is three characteristic seeming facts about if s, mights, and musts
that at first blush look like the kind of thing our best story ought to answer to. So let’s agree
to take them at face value and see where that leads. Later, if your English breaks with mine
or if your old school pride overwhelms, you can deny the Facts or explain them away as your
preferences dictate. Second: the Facts may seem eerily familiar. They are not far removed
from the sorts of examples of the interplay between adverbs of quantification and if -clauses
in Lewis 1975 and Kratzer 1986. That is no coincidence, as we’ll see (briefly) in Section 7.
Third: since the operator view isn’t the only game in town and since predicting the Facts
is something any story (old school or otherwise) must do, we should state the Facts in a
way that is agnostic on the iffy thesis. So the Facts characterize what is true of sentences
in (quasi-)English, not necessarily what is true of their LFs in our regimented intermediate
language.
I do not know whether Carl made it to the party. But wherever Carl goes,
Lenny is sure to follow. So if Carl is at the party, Lenny must be — Lenny is at
the party, if Carl is. We just glossed an if with a commingling epistemic must
by a bare if with no (overt) modal at all. Thus:
(3) a. If Carl is at the party, then Lenny must be at the party. ≈

b. If Carl is at the party, then Lenny is at the party.
This pair has the ring of (truth-conditional) equivalence. Fact 2 below records
that. But there are also arguments for thinking that the truth-value of (3a)
should stand and fall with the truth-value of (3b).
For suppose that such if s validate a deduction theorem and modus
ponens, and that must is factive.16 The left-to-right direction: assume that
(3a) is true. And consider the argument:
(4) If Carl is at the party, then Lenny must be at the party.

Carl is at the party.
So: Lenny is at the party.
The first two sentences — intuitively speaking — entail the third. And that is
pushed on us by the assumptions: from the first two sentences we have (by
modus ponens) that Lenny must be at the party, which by factivity entails
Lenny is at the party. Apply the deduction theorem and we have that If Carl
is at the party, then Lenny must be at the party entails If Carl is at the party,
then Lenny is at the party. Since we have assumed that (3a) is true, it follows
that (3b) must be. There are spots to get off this bus to be sure — by denying
either modus ponens or by denying the factivity of must — but those costs
are high.17
The right-to-left direction: assume that (3b) is true and consider:
16 Remember that, for now, we are dealing with properties of sentences of (quasi-)English not
properties of those sentences’ LFs in some regimented language. The argument here isn’t
meant to convince you of Fact 2, it is meant to make some of the costs of denying the data
vivid. Geurts (2005) also notes that bare conditionals and their must-enriched counterparts
are “more or less equivalent”.
17 You have to troll some pretty dark corners of logical space for deniers of modus ponens,
but that’s not true for deniers of the factivity of must. That view has something of mantra
status among linguists (philosophers are surprised to hear that). Mantra or not, it is wrong.
For an all-out attack on it see von Fintel & Gillies 2010. Here is just one sort of consideration:
if must p didn’t entail p (because must is located somewhere below the top of the scale of
epistemic strength), then you’d expect must to combine with only in straightforward ways
the way might can:
(5) If Carl is at the party, then Lenny is at the party.

Carl is at the party.
So: Lenny must be at the party.
This is as intuitive an entailment as we are likely to find. Whence it follows by

the deduction theorem that If Carl is at the party, then Lenny is at the party
on its own entails If Carl is at the party, then Lenny must be at the party. So
if (3b) is true so must be (3a): that’s why the former seems to gloss the latter.
Fact 2 (if/must). Conditional sentences like these are true in exactly the
same scenarios:
i. if S1 , then must S2
ii. if S1 , then S2
The glossing that this pattern permits is a nifty trick. But that is only half
the story since if can also co-occur with epistemic might. The interaction
between if and might is different and underwrites a different glossing.
Alas, my team are not likely to win it all this year. It is late in the season
and they have made too many miscues. But they are not quite out of it. If
they win their remaining three games, and the team at the top lose theirs,
my team will be champions. But our last three are against strong teams
and their last three are against cellar dwellers. Still, my spirits are high:
if we win out, we might win it all. Put another way, within the (relevant)
my-team-wins-out possibilities — of which there are some — lies a my-team-
wins-it-all possibility; there is a my-team-wins-out possibility that is a my-
team-wins-it-all possibility. But that is just to say that there are (relevant)
my-team-wins-out-and-wins-it-all possibilities. Maybe not very many, and
maybe not so close, but some.18
Apart from keeping hope alive, the example also illustrates that we can
gloss an indicative with a co-occurring epistemic might by a conjunction
under the scope of might:
(6) a. If my team wins out, they might win it all. ≈

b. It might turn out that my team wins out and wins it all.
(i) a. I didn’t say it is raining, I only said it might be raining.
b. #I didn’t say it is raining, I only said it must be raining.
But it doesn’t.
18 For the record: the Cubs. Please don’t bring it up.
That gloss sounds pretty good. And for good reason: conjunctions that you
would expect to be happy if the truth of (6a) and (6b) could come apart are
not happy at all:
(7) a. #If my team wins out, they might win it all; moreover, they can’t win
out and win it all.
b. #It might turn out that my team wins out and wins it all, and, in
addition there’s no way that if they win out, they might win it all.
That gives us the third Fact about how if s play with modals.19
Fact 3 (if/might). Sentences like these are true in exactly the same scenarios:
i. if S1 , then might S2
ii. it might be that [S1 and S2 ]
It’s now a matter of telling some story, iffy or otherwise, that answers to
these Facts. Old school operator views will have trouble with them; the new
school restrictor view predicts them trivially.
6 Scope matters
The operator view takes if to express an operator, an iffy operator, and the
same iffy operator no matter whether we have a co-occurring epistemic modal
or not and no matter whether the modal is must or might. In cases where
there is a modal, scope issues have to be sorted out. Take a sentence of the
form
(8) If S1 then modal S2
and let S1′ (S2′ ) be the L-representation for sentence S1 (S2 ), and modal the
L-representation for modal. We have a short menu of options for the relevant
LF for such a sentence: either the narrowscoped (9a) or the widescoped (9b):
(9) a. (if S1′ )(modal S2′ )
b. modal (if S1′ )(S2′ )
If you want to put your LFs in tree form, be my guest: opting for narrowscoping
means opting for sisterhood between modal and S2 ; opting for
widescoping means opting for sisterhood between modal and if S1 then S2 .
19 There is a wrinkle: Fact 3 implies that if S1 , then might S2 is true in just the same spots as if
S2 , then might S1 . Seems odd:
(i) a. If I jump out the window, I might break a leg.
b. If I break a leg, I might jump out the window.
The first is true, the second an overreaction. I intend, for now, to sweep this under the same
rug that we sweep the odd way in which Some smoke and get cancer/Some get cancer and
smoke don’t feel exactly equivalent even though Some is a symmetric quantifier if ever there
was one. (The rug in question seems to be the tense/aspect rug; similar considerations drive
von Fintel’s (1997) discussion of contraposition of bare conditionals.)
The trouble for the operator view is that, since if has to express inclusion,
neither choice will do. One choice for scope relations seems ruled out by
consistency (Fact 1), the other by if/must (Fact 2) and if/might (Fact 3).
To put the trouble precisely, we need one more ground rule. Contexts,
we said, have the job of determining the domains the modals quantify over.
Modals, I’ll assume, do their job in the usual way by expressing their usual
quantificational oomph over those domains: must (at i, with respect to C)
acts as a universal quantifier, and might as an existential quantifier, over Ci .
Definition 6.1 (modal force).
i. ⟦might p⟧^{C,i} = 1 iff Ci ∩ ⟦p⟧^C ≠ ∅
ii. ⟦must p⟧^{C,i} = 1 iff Ci ⊆ ⟦p⟧^C
Now suppose we plump for narrowscoping. Then, given the ground rules,
we cannot predict the consistency of the likes of (2) and that means that we
cannot square iffiness with Fact 1. That’s true no matter how you fill in the
particulars of the iffy story.
Here is the narrowscoped analysis of my lost marbles. We have a modal
and two indicatives:
(10) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
(if ¬q)(must p)
c. If Red isn’t in the box, then Yellow must be.
(if ¬p)(must q)
Any good story has to allow that the bundle of if s in (10b) and (10c) is
consistent with the conjunction in (10a). But, assuming narrowscoping,
this — even without taking a stand on how we choose Di and so without

taking a stand on what counts as the set of if -relevant worlds — seems to be
beyond what can be delivered by any version of the operator view.
Observation 6.1. Suppose p and q partition the possibilities in C and that

(10a) is true. Then the (narrowscoped) sentences in (10) can’t all be true.
Proof. Suppose otherwise — that the regimented formulas in L are all true at
a live possibility, say i, with respect to C. Just one of my marbles is in the
box. So any world in Ci is either a p-world or a q-world, but not both; C is
well-behaved, so i ∈ Ci . That leaves two cases.
case 1: i ∈ ⟦¬q⟧. By hypothesis ⟦(if ¬q)(must p)⟧^{C,i} = 1, and so Di ∩
⟦¬q⟧^C ⊆ ⟦must p⟧^C . Since i ∈ Di , it then follows that i ∈ ⟦must p⟧^C , which
is to say ⟦must p⟧^{C,i} = 1. Thus Ci has only p-worlds in it. But that is at
odds with the second conjunct of (10a): that might q is true at i guarantees a
q-world, hence a ¬p-world, in Ci .
case 2: i ∈ ⟦¬p⟧. By hypothesis ⟦(if ¬p)(must q)⟧^{C,i} = 1, and so Di ∩
⟦¬p⟧^C ⊆ ⟦must q⟧^C . Since i ∈ Di , it then follows that i ∈ ⟦must q⟧^C , which
is to say ⟦must q⟧^{C,i} = 1. Thus Ci has only q-worlds in it. But that is at odds
with the first conjunct of (10a): that might p is true at i guarantees a p-world,
hence a ¬q-world, in Ci .
Narrowscoping has the virtue of taking plain and simple LFs to represent
indicatives with apparently epistemic modalized consequents. But it has the
vice of not squaring with consistency. This is true no matter the particulars
of your favorite version of the operator view.20
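The lost-marbles case is small enough to check exhaustively. The following sketch is an illustration under assumptions of my own: a two-world model (one world per marble), a context that leaves both worlds live, and inclusion truth conditions for if as in Observation 4.1. The helper names are hypothetical. It tries every admissible choice of Di and confirms that the narrowscoped (10b) and (10c) are never true together, although (10a) is true throughout.

```python
from itertools import combinations, product

W = {"r", "y"}              # r: Red is in the box; y: Yellow is in the box
p, q = {"r"}, {"y"}         # p: Red is in the box; q: Yellow is in the box
C = {i: set(W) for i in W}  # well-behaved: both worlds are live everywhere

def might(X):
    # worlds where might X is true (Definition 6.1)
    return {i for i in W if C[i] & X}

def must(X):
    # worlds where must X is true (Definition 6.1)
    return {i for i in W if C[i] <= X}

def if_true(D, antecedent, consequent, i):
    # inclusion truth conditions: D_i ∩ P ⊆ Q
    return (D[i] & antecedent) <= consequent

def candidate_domains(i):
    # every D_i with i ∈ D_i and D_i ⊆ C_i
    others = sorted(C[i] - {i})
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            yield {i, *extra}

worlds = sorted(W)
verdicts = []
for choice in product(*(list(candidate_domains(i)) for i in worlds)):
    D = dict(zip(worlds, choice))
    for i in worlds:
        s10a = i in might(p) and i in might(q)
        s10b = if_true(D, W - q, must(p), i)   # (if ¬q)(must p)
        s10c = if_true(D, W - p, must(q), i)   # (if ¬p)(must q)
        verdicts.append((s10a, s10b and s10c))

assert all(a for a, _ in verdicts)        # (10a) is true throughout
assert not any(bc for _, bc in verdicts)  # (10b) and (10c) never hold together
```

The failure tracks the two cases of the proof: at the Red world (10b) fails, and at the Yellow world (10c) fails, whatever Di is.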
So suppose instead that co-occurring modals scope over the if -constructions
in which they occur. Now it is the generalizations if/must and if/might
that cause trouble. Again, that’s true no matter how Di is chosen and so
no matter what counts as an if -relevant possibility and so no matter what
conditional operator we say if expresses.
Here is a widescope analysis of the key examples (3) and (6):
(11) a. If Carl is at the party, then Lenny must be at the party.
must (if p)(q)
b. If Carl is at the party, then Lenny is at the party.
(if p)(q)
20 Thus by supplying how your favorite version of the operator view says Di is determined, you
can use this proof to show how that story (assuming narrowscoping) departs from Fact 1.
(12) a. If my team wins out, they might win it all.
might (if p)(q)
b. It might turn out that my team wins out and wins it all.
might (p ∧ q)
The facts are that must (if p)(q) ≈ (if p)(q) and that might (if p)(q) ≈
might (p ∧ q). What we need is a semantics for the conditional connective
(if ·)(·) that can predict both patterns. But paths that might lead to one
pretty reliably lead away from the other.
So far I have insisted that i is always among the relevant worlds to an
if at i (i ∈ Di ) and also that only worlds compatible with the context are
relevant (Di ⊆ Ci ). Here I am in good company. But perhaps there is even
more interaction between domains of if -relevant worlds and contexts.
Some theories say that there can be no difference in domains for condi-
tionals between worlds compatible with the context, others disagree:
Definition 6.2 (egalitarianism & chauvinism).

i. A semantics is egalitarian iff, whenever j ∈ Ci , Dj = Di .
ii. A semantics is chauvinistic iff it is not egalitarian.
egalitarianism requires domains to be invariant across worlds compati-

ble with a context. That means that distinctions between worlds made by
D’s — this world is relevant, that one isn’t — are unaffected when those dis-
tinctions are made from behind the veil of ignorance (we don’t know which
world compatible with C is the actual world). Chauvinistic theories allow
differences from behind the veil to matter to what possibilities get selected
for domainhood, and thus allow that a possibility j ∈ Ci may determine a
different set of relevant possibilities than does i. Once we have agreed that,
for any i, Di selects from the worlds compatible with C and must include i,
it is a further question whether we want to be egalitarians or chauvinists.21
21 The history of the conditional is littered with chauvinists. The material conditional analysis
is chauvinistic. It says that the only possibility relevant for the truth of an if at i in C is
i itself. And similarly for an if at j: only j matters there. Thus, except in the odd case
where the context rules out uncertainty altogether, we will have that Dj ≠ Di , for any choice
of i and j compatible with C. A variably strict conditional analysis, based on a family
of orderings (one for each world), is chauvinistic if we do not impose an “absoluteness”
condition — the requirement that orderings around any two worlds be the same. (Lewis
(1973: §6) discusses absoluteness in the process of characterizing the V -logics.) What to say
about absoluteness is optional and so there is room for agnosticism about chauvinism.
Stalnaker’s (1975) treatment of indicatives is not officially agnostic about chauvinism, but
It is hard to be a chauvinist. That is because, assuming the particulars

of the chauvinistic theory are compatible with there being a (p ∧ ¬q)-world
in Ci but not in Di , no such story will predict if/must. The data say that
bare indicatives and their must-enriched counterparts are true in the same
scenarios. But chauvinism plus widescoping guarantees that the domain
the if quantifies over is properly included in the domain its must-enriched
counterpart quantifies over. Thus the former says something strictly weaker
than — true in strictly more spots than — the latter. That is at odds with Fact
2:
Observation 6.2. Suppose that Di ⊂ Ci . There are scenarios in which the

widescoped (11b) is true but (11a) isn’t. Thus chauvinism plus widescoping
can’t explain Fact 2.
Proof. Consider a (p ∧ ¬q)-world, call it j, and suppose that Ci does, but
Di does not, contain j. Then every possibility in Di ∩ ⟦p⟧ is in ⟦q⟧ and the
plain if is true (at i, in C): ⟦(if p)(q)⟧^{C,i} = 1. But not the widescoped
must-enriched if . That is because there is a world in Ci , namely j, such that not
every possibility in Dj ∩ ⟦p⟧ is a possibility in ⟦q⟧ (j itself is in Dj ). Thus
⟦(if p)(q)⟧^{C,j} = 0 and so it is not true that the plain if is true at every
world in Ci and so ⟦must (if p)(q)⟧^{C,i} = 0.

Again, this is true no matter how we fill in the particulars of the operator
view. If we widescope the modals, and the story is chauvinistic, it will not
square with Fact 2.
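A two-world countermodel makes Observation 6.2 concrete. It is a sketch under assumed particulars, none of them from the text: the worlds are labeled i and j, the domains are the material-style Di = {i} and Dj = {j}, and p and q are chosen so that j is a (p ∧ ¬q)-world.

```python
W = {"i", "j"}
C = {"i": {"i", "j"}, "j": {"i", "j"}}  # well-behaved context
D = {"i": {"i"}, "j": {"j"}}            # chauvinistic: D_i ≠ D_j and D_i ⊂ C_i
p = {"i", "j"}                          # both worlds are p-worlds
q = {"i"}                               # so j is a (p ∧ ¬q)-world

def if_worlds(antecedent, consequent):
    # worlds w where D_w ∩ P ⊆ Q, i.e. where the plain if is true
    return {w for w in W if (D[w] & antecedent) <= consequent}

def must(X):
    # worlds where must X is true (Definition 6.1)
    return {w for w in W if C[w] <= X}

plain = if_worlds(p, q)     # where (if p)(q) is true
widescoped = must(plain)    # where must (if p)(q) is true

print("i" in plain, "i" in widescoped)
```

At i the plain conditional comes out true while its widescoped must-enriched counterpart comes out false, which is the divergence Fact 2 rules out.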
Given widescoping, egalitarianism fares no better. But here it is
if/might (Fact 3) that causes trouble. This time the issue is triviality: must-
enriched if s are true iff their might-enriched counterparts are.
Here is why. First, egalitarianism implies that Di covers Ci :
Observation 6.3. egalitarianism implies that Di = Ci .
Proof. Assume otherwise. Di ⊆ Ci , so there must be a j ∈ Ci such that j ∉ Di .

By egalitarianism, Dj = Di . But we know that j ∈ Dj . Contradiction.
that is only because he requires that ≤i induce a total order that is centered pointwise on
i, and that rules against absoluteness. But the pragmatic mechanisms he develops there
are agnostic on the chauvinism question — what he says about how the context constrains
selection functions is compatible with both egalitarianism and chauvinism. I myself see
little reason to go for chauvinism.
Thus if Di reflects some measure of proximity to i, egalitarianism

implies that the underlying ordering is centered not pointwise on i but
setwise on the worlds compatible with C. So egalitarianism implies that
if is really a strict conditional. That’s true whether Di is derived from some
underlying ordering or not: if , might and must quantify over the same
domain of possibilities, and an if is true at i iff all of the antecedent worlds
in that domain are consequent worlds.22 That means that an if at i (in C)
is true iff the corresponding material conditional is true at every possibility
compatible with C. And that means that such an if is true at i iff the material
conditional, widescoped by must, is true at i.23
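But from this degree of fit between $D_i$ and $C_i$ it follows straightaway that
no two possibilities compatible with C can differ over an if issued in C. There
is solidarity among ifs; they stand and fall together:
Observation 6.4. egalitarianism implies
$\llbracket (\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$ iff for every $j \in C_i$: $\llbracket (\mathit{if}\ p)(q) \rrbracket^{C,j} = 1$
Proof. $\llbracket (\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$ iff $D_i \cap \llbracket p \rrbracket \subseteq \llbracket q \rrbracket$.
By egalitarianism: iff, for any $j \in C_i$, $D_j \cap \llbracket p \rrbracket \subseteq \llbracket q \rrbracket$.
Equivalently: iff, for any $j \in C_i$, $C_j \cap \llbracket p \rrbracket \subseteq \llbracket q \rrbracket$;
that is, iff for every such j, $\llbracket (\mathit{if}\ p)(q) \rrbracket^{C,j} = 1$.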

Given widescoping, any story with this equivalence will have a hard time
saying why conditionals like (12a) seem to be true iff modalized conjunctions
like (12b) are and so will have trouble with if/might. That is because, given
the usual story for the modals (Definition 6.1), we get triviality:
Observation 6.5. egalitarianism implies:
$\llbracket \mathit{might}\,(\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$ iff $\llbracket \mathit{must}\,(\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$

Thus widescoping plus egalitarianism implies that must (if p)(q) is true
iff might (p ∧ q) is. Not even Cubs fans fall for that.
22 Strictness makes it easy to understand why negating a bare conditional sounds so much
like saying the counterexample might obtain. For more on context-dependent strictness
(of different flavors) see, e.g., Veltman 1985, von Fintel 1998a, 2001, and Gillies 2004, 2007,
2009.
23 Thus, given well-behavedness (Definition 3.2), explaining Fact 2 is easy for widescoping
egalitarians: $(\mathit{if}\ p)(q)$ is equivalent to $\mathit{must}\,(p \supset q)$ which, given well-behavedness, is
equivalent to $\mathit{must}\,\mathit{must}\,(p \supset q)$. And that, in turn, is equivalent to $\mathit{must}\,(\mathit{if}\ p)(q)$.
Proof. Note that $\llbracket \mathit{might}\,(\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$ iff the plain conditional (if p)(q) is
true somewhere in $C_i$. But by Observation 6.4 the plain if is true somewhere
in $C_i$ iff it is true everywhere in $C_i$. And it is true everywhere in $C_i$ just in case
$\llbracket \mathit{must}\,(\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$. That trivializes rather than explains Fact 3.

No matter the particulars, widescoping plus egalitarianism can’t predict
Fact 3.
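
The collapse in Observation 6.5 can be checked by brute force on a small model. The following is a minimal Python sketch, assuming a toy universe of three worlds, contexts given by partitions (one simple way to get well-behavedness), and propositions modeled as sets of worlds. All function names are illustrative and not part of the paper's formalism.

```python
from itertools import combinations

W = (0, 1, 2)  # toy universe of three worlds (an assumption of this sketch)

# All partitions of W; the cell containing i plays the role of C_i.
PARTITIONS = [
    [{0, 1, 2}],
    [{0}, {1, 2}],
    [{1}, {0, 2}],
    [{2}, {0, 1}],
    [{0}, {1}, {2}],
]

def subsets(xs):
    """All subsets of xs as frozensets (the candidate propositions)."""
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def context(partition):
    """Map each world i to C_i, the cell of the partition that contains i."""
    return {i: frozenset(cell) for cell in partition for i in cell}

def plain_if(C, i, p, q):
    """Egalitarian strict conditional: D_i = C_i, so test C_i ∩ p ⊆ q."""
    return (C[i] & p) <= q

def must_if(C, i, p, q):
    """Widescoped must: the plain if is true at every world in C_i."""
    return all(plain_if(C, j, p, q) for j in C[i])

def might_if(C, i, p, q):
    """Widescoped might: the plain if is true at some world in C_i."""
    return any(plain_if(C, j, p, q) for j in C[i])

def might_and(C, i, p, q):
    """might (p ∧ q): some world in C_i is both a p-world and a q-world."""
    return bool(C[i] & p & q)

def collapse_holds():
    """Exhaustive check of Observation 6.5 on the toy universe."""
    props = subsets(W)
    return all(
        might_if(context(part), i, p, q) == must_if(context(part), i, p, q)
        for part in PARTITIONS for p in props for q in props for i in W
    )

# A model where the widescoped might-conditional and might (p ∧ q) come apart.
C = context([{0, 1}, {2}])
p, q = frozenset({0, 1}), frozenset({0})
print(collapse_holds())
print(might_if(C, 0, p, q), might_and(C, 0, p, q))
```

In the last model world 1 is a p-world that is not a q-world, so the plain if fails throughout the cell even though a (p ∧ q)-world is available. That is the sense in which the widescoped egalitarian story fits poorly with Fact 3.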
Iffiness requires conditionals to have a structure that does not play nice
with modals. That’s because no way of resolving the relative scopes will
work.24 What causes the trouble is that the operator view requires if to mean
all. But the Facts don’t seem to allow that. If we widescope, then sometimes
that seems all right — if the modal in question happens to have universal
quantificational force. But when the modal is existential, if looks more like
conjunction than inclusion. And narrowscoping seems no better, rendering
all manner of coherent bits of discourse inconsistent.
That is pretty bad news for the operator view. True, we could save
iffiness by denying some Fact or other. (With defenders like that who needs
detractors?) Adding insult to injury: the Facts were chosen not at random but
with an eye to the competition. They are Facts that the new school restrictor
view predicts so easily hardly anyone has noticed.
7 Iffiness lost
Lewis (1975) famously argued that if s appearing in certain quantificational

constructions (under adverbs of quantification) are not properly iffy, that the
if in
24 Could we go for widescoping must-enriched indicatives and narrowscoping might-enriched
indicatives? For all we’ve said so far: yes. But that strategy faces an uphill battle. It is ad
hoc, three times over. First because there is no good reason to think we should settle for
anything less than a uniform story. Second because it is not obvious what it says we should
do when we consider ways in which the modal might be embedded. What if the modal is
can’t (a possibility modal scoped under negation) or needn’t (a universal under negation)?
(i) a. If my team doesn’t win out, they can’t win it all.

b. If the gardener didn’t do it, the culprit needn’t be the butler.
Do we widescope or narrowscope these? What principled story is there that predicts, rather
than stipulates, that the first is widescoped and the second narrowscoped? Third because
as soon as we consider epistemic modals that lie between the existential might and the
universal must — like probably and unlikely — it is doomed to failure anyway.
(13) {Always, Sometimes, Never} if a man owns a donkey, he beats it.
is not a conditional connective with a conditional operator as its meaning

but instead acts as a non-connective whose only job is to mark an argument-
place for the adverb of quantification. The relevant structure is not some
Q-adverb scoped over a conditional nor some conditional with a Q-adverb
in its consequent, he said, but instead something like
(14) Q-adverb + if-clause + then-clause
The job of the if -clause in (13) is merely to restrict the domain over which
the adverb (unselectively) quantifies, and allegedly that restricting job is a
job that cannot be done by treating if as a conditional connective with a
conditional operator as its meaning. If Q-adverb is universal, maybe an iffy
if will work; but if it is existential, then conjunction does better. I want to set
the issue about adverbial (and adnominal, for that matter) quantifiers aside for
two reasons. First because I doubt the allegation sticks. But that is another
argument for another day.25 And second because it will do us good to focus
on simple cases.
Still, the trouble for the operator view that is center stage here does look
quite a lot like the problem Lewis pointed out. We have to make room for
interaction between if -clauses and the domains our modals quantify over.
But that interaction is tricky. That is because it looks impossible to assign
if the same conditional meaning — thereby taking its contribution to be an
iffy one — in all of our examples. Indeed, when the modal is universal a con-
ditional relation looks good; but when the modal is existential, conjunction
looks better. This is pretty much the same trouble Lewis saw for if s occurring
under adverbs of quantification, and led him to conclude that such if s do not
express operators at all (and a fortiori not conditional operators).26 Just as
with adverbial quantifiers, there is a fast and easy solution to the problem
if we get rid of the old school idea that if is a conditional connective and
plump instead for anti-iffiness. The most forceful way of putting the anti-iffy
thesis is Kratzer’s (1986: 11):
25 There are ways to get the restricting job done after all. The operator-based stories in, e.g.,
Belnap 1970, Dekker 2001, and von Fintel & Iatridou 2003 all manage.
26 For recent and more thorough-going defenses of if s-as-quantifier-restrictors see, e.g., Kratzer
1981, 1986 and von Fintel 1998b. But see Higginbotham 2003 for a dissenting view.
The history of the conditional is the history of a syntactic

mistake. There is no two-place “if. . . then” connective in the
logical forms for natural languages. “If”-clauses are devices for
restricting the domains of various operators.
The thesis is that the relevant structure for the conditionals at issue here
is not some modal scoped over a conditional nor some conditional with a
modal in its consequent, but is instead something like
(15) modal + if-clause + then-clause
Or, closer to the way we’ve been putting things:
(16) modal(if-clause)(then-clause)
The job of the if -clause is to restrict the domain over which the modal
quantifies. So instead of searching for a conditional operator properly so
called that if contributes whether it commingles with a modal or not, we
search for an operator for if to restrict. And, for indicative conditionals,
we do not have to search far: the operators are (possibly covert) epistemic
modals.27
So it is the modals, not the if s, that take center stage. They have logical
forms along the lines of modal(p)(q), with the usual quantificational force:
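Definition 7.1 (modal force, amended).
i. if defined, $\llbracket \mathit{might}\,(p)(q) \rrbracket^{C,i} = 1$ iff $(C_i \cap \llbracket p \rrbracket) \cap \llbracket q \rrbracket^{C} \neq \emptyset$
ii. if defined, $\llbracket \mathit{must}\,(p)(q) \rrbracket^{C,i} = 1$ iff $(C_i \cap \llbracket p \rrbracket) \subseteq \llbracket q \rrbracket^{C}$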
This plus two assumptions gets us the now-standard and familiar restrictor
view. It easily accounts for consistency (Fact 1), if/must (Fact 2), and
if/might (Fact 3).
First assumption: assume that when there is no if -clause and so no
restrictor is explicit — as in Blue might be in the box or Yellow must be in
the box — the first argument in the lf of the modal is filled by your favorite
tautology ($\top$). In those cases there is nothing to choose between an analysis
that follows our earlier Definition 6.1 and an analysis that follows Definition
27 Officially, our intermediate language now also goes in for a change. L had one-place modals
might and must and a two-place connective (if ·)(·). That won’t do to represent the restrictor
view. Instead, we need the two-place modals might (·)(·) and must (·)(·) and have no need
for a special conditional connective that expresses a conditional operator.
7.1, and so the latter generalizes the former.

Second assumption: assume that the job of if -clauses is to make a (non-
trivial) restrictor explicit. If there is no overt modal — as in a bare condi-
tional — the if restricts a covert must. Collecting the pieces:
Definition 7.2 (anti-iffiness). For any sentence $S$, let $S'$ be its lf in our
intermediate language. Then:
i. A sentence of the form if $S_1$ then $S_2$ has lf:
a. $\mathit{modal}\,(S_1')(R')$ if $S_2' = \mathit{modal}\ R'$
b. $\mathit{must}\,(S_1')(S_2')$ otherwise
ii. Truth conditions as in Definition 7.1
Return to the case of my missing marbles. Taking the if-clauses to be
restrictors in the example:
(17) a. $\mathit{might}\,(\top)(p) \wedge \mathit{might}\,(\top)(q)$
b. $\mathit{must}\,(\neg q)(p)$
c. $\mathit{must}\,(\neg p)(q)$
It’s modals all the way down. And the modals can all be true together.
Observation 7.1 (anti-iffiness & consistency). Assume anti-iffiness
(Definition 7.2). And suppose, in C, that (17a) is a partitioning modal. Then the
sentences in (17) can all be true together.
Proof. I am in i and there are just two worlds compatible with the facts I
have, i and j. The first is a (p ∧ ¬q)-world, the second a (q ∧ ¬p)-world.
The restrictors in (17a) are trivial, so it is true at i iff $C_i$ has a p-world in
it and a q-world in it; i witnesses the first conjunct, j the second. The
restricting if-clause of (17b) makes sure that the must ends up quantifying
only over the ¬q-worlds compatible with C: (17b) is true at i iff all of the
worlds in $C_i \cap \llbracket \neg q \rrbracket$ are p-worlds. And the only one, i, is. Similarly for the must
in (17c): it quantifies over the ¬p-worlds in $C_i$, checking to see that they are
all q-worlds.
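
The two-world model used in this proof can be replayed mechanically. The following is a minimal Python sketch, assuming propositions are modeled as sets of worlds and the two-place modals of Definition 7.1 as set operations. The names `might`, `must`, `TOP` and the string labels for worlds are illustrative choices, and the definedness condition is ignored.

```python
# Two worlds compatible with what I know, as in the proof:
# "i" is a (p and not-q)-world, "j" is a (q and not-p)-world.
W = frozenset({"i", "j"})
C = {"i": W, "j": W}   # C maps each world w to C_w

p = frozenset({"i"})   # the proposition true only at i
q = frozenset({"j"})   # the proposition true only at j
TOP = W                # the tautology: true at every world

def might(C, w, restrictor, scope):
    """Definition 7.1(i): (C_w ∩ restrictor) ∩ scope is non-empty."""
    return bool(C[w] & restrictor & scope)

def must(C, w, restrictor, scope):
    """Definition 7.1(ii): (C_w ∩ restrictor) ⊆ scope."""
    return (C[w] & restrictor) <= scope

s17a = might(C, "i", TOP, p) and might(C, "i", TOP, q)
s17b = must(C, "i", W - q, p)   # must(¬q)(p)
s17c = must(C, "i", W - p, q)   # must(¬p)(q)
print(s17a, s17b, s17c)         # → True True True
```

Complementation relative to `W` stands in for negation here, which is adequate only because p and q are non-modal in this example.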
It is just as easy to square this picture with if/must (Fact 2) and if/might
(Fact 3). Here are the examples with their new school lfs:
(18) a. $\mathit{must}\,(p)(q)$
b. $\mathit{must}\,(p)(q)$
(19) a. $\mathit{might}\,(p)(q)$
b. $\mathit{might}\,(\top)(p \wedge q)$
Observation 7.2 (anti-iffiness, if/must, & if/might). Assume anti-iffiness
(Definition 7.2). Then:
i. If $S_1$, then $S_2$ ≈ If $S_1$, then must $S_2$
ii. If $S_1$, then might $S_2$ ≈ might [$S_1$ and $S_2$]
Proof. anti-iffiness assigns the same lf to a bare conditional like (18b) and
its must-enriched counterpart (18a): must (p)(q). It would thus be hard, and
pretty undesirable, for their truth conditions to come apart. That explains
if/must.
Now consider the if-as-restrictor analysis of the sort of examples behind
if/might in (19). If (19b) is true at i in C then $C_i$ has a (p ∧ q)-world in it.
But then that same world must be in $C_i \cap \llbracket p \rrbracket$. It is a q-world, and that will
witness the truth of (19a) at i. Going the other direction: if (19a) is true at
i in C, then there are some q-worlds in $C_i \cap \llbracket p \rrbracket$. Any one of those will do
as a (p ∧ q)-world in $C_i$, and that is sufficient for (19b) to be true at i. That
explains if/might.
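
The if/might half of this proof is an identity between two set-theoretic conditions, so it can be confirmed exhaustively on a small universe. The following is a minimal Python sketch, assuming non-modal p and q modeled as sets of worlds over a hypothetical three-world universe. The helper names are illustrative.

```python
from itertools import combinations

W = frozenset(range(3))   # toy universe (an assumption of this sketch)

def subsets(xs):
    """All subsets of xs as frozensets."""
    xs = sorted(xs)
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def might(Ci, restrictor, scope):
    """Definition 7.1(i), stated directly in terms of the set C_i."""
    return bool(Ci & restrictor & scope)

def if_might_matches_might_and():
    """might(p)(q) and might(⊤)(p ∧ q) agree for every non-empty C_i, p, q."""
    props = subsets(W)
    return all(
        might(Ci, p, q) == might(Ci, W, p & q)
        for Ci in props if Ci
        for p in props
        for q in props
    )

print(if_might_matches_might_and())
```

The if/must half needs no check of this kind: under Definition 7.2 the bare conditional and its must-enriched counterpart receive the very same lf.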
These explanations are easy. And, given the trouble for the operator
view, it looks like the only game in town is to say that if doesn’t express an
operator and so not an iffy operator. That stings.
8 Iffiness regained
The problem for iffiness is that there is an interaction between if -clauses

and the domains our modals quantify over. That is an interaction that seems
hard to square with the thesis that if is a binary connective with a conditional
meaning if we assume that it has the same meaning in each of the cases we
care about here.
But we have overlooked a possibility. We insisted that for a story to be iffy
it must say that (if p)(q) at i in C expresses some relation R between $D_i \cap P$
and Q, where $D_i \cap P$ is the set of (relevant) worlds where the antecedent is
true and Q the set of worlds where the consequent is true. That is all right.
But we unthinkingly assumed that the context relevant for figuring out what
these sets of worlds are must always be C just because that was the context
as it stood when the if was issued. That was a mistake. Setting it straight
sets the record straight for old school iffiness.
The Ramsey test (the schoolyard version, anyway) is a test for when an
indicative conditional is acceptable given your beliefs. It says that (if p)(q)
is acceptable in belief state B iff q is acceptable in the derived or subordinate
state B-plus-the-information-that-p. You zoom in on the portion of B where
p is true and see whether q holds throughout that region. But our job is to say
something about the linguistically encoded meanings of indicatives not to
dole out epistemic advice. Still, the Ramsey test (plus or minus just a bit) can
be turned into a strict conditional story about truth-conditions.
Here’s how (in three easy steps). Step one: sentences get truth-values at
worlds in contexts. So swap C’s for B’s. Step two: embrace egalitarianism.
The worlds compatible with the context are the if -relevant worlds. These
first two steps give us a strict conditional analysis of indicatives, requiring
that (if p)(q) is true at i in C iff all the p-possibilities in $C_i$ are possibilities
at which q is true. But truth depends on both index and context. Question:
What context is relevant for checking to see whether q is true at these
p-possibilities? Answer: The Ramseyan derived or subordinate context C-
plus-the-information-that-p, or C + p for short. That’s step three.
The Ramsey test invites us to add the information carried by the an-
tecedent to the contextually relevant stock of information C and check the
fate of the consequent. What we fans of iffiness overlooked was that this
assigns two jobs to if -clauses, and we only paid attention to one of them.
One job is the index-shifting job. The if -clause tells us to shift to various
alternative indices — the antecedent-possibilities compatible with C — to see
whether the consequent is true at them. This job is familiar and most ver-
sions of the operator view do a fine job tending to it. But there is another
job. When we add the information carried by the antecedent to C we also
add to the context relevant for figuring out whether the consequent is true.
That is the context-shifting job. The if -clause tells us to shift to an alternative
derived or subordinate state to see whether the consequent is true. We fans
of old school iffiness made the mistake of only making sure that the first job
got done.
So far this isn’t a story about the meaning of if (much less an iffy one). It
is a blueprint for how to construct a semantics that gives a uniform and iffy
meaning to if s whether or not those if s mix and mingle with other operators.
To construct a story using it we need to take a stand on what it means to add
the information carried by an antecedent to the contextually relevant stock
of information. Taking that stand depends on the aspirations of the theory
since different constructions may depend on different sorts of contextually
available information and there is every reason to think that augmenting
information of different sorts goes by different rules. But our aspirations are
pretty modest here: how indicatives interact with epistemic modals. So we
can opt for an equally simple stand on what it means to add information to a
context.
Even before getting all the details laid out, we can see how the doubly
shifty behavior of if -clauses will be able to predict what needs predicting
about how indicatives and epistemic modals interact. The difference between
interpreting q against the backdrop of the prior context C and against the
backdrop of C + p is a difference that makes no difference if q has no context
sensitive bits in it. No wonder we missed it! But if q does have context
sensitive bits in it — like might or must, whose semantic value depends
non-trivially on C — then this is a difference that makes all the difference.
For example: consider a modal like must q. The contexts C and C + p may
well determine different sets of possibilities. Since must q depends exactly
on whether that set of possibilities has only q-worlds in it, we then get
a difference. Thus if must q is the consequent of an indicative, context-
shiftiness matters.
Here is the simplest way of constructing a semantics around the blueprint:
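Definition 8.1 (iffiness + shiftiness).
i. if defined, $\llbracket (\mathit{if}\ p)(q) \rrbracket^{C,i} = 1$ iff $C_i \cap \llbracket p \rrbracket^{C} \subseteq \llbracket q \rrbracket^{C+p}$
ii. $C + p = \lambda i.\, C_i \cap \llbracket p \rrbracket^{C}$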
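
To make the two shifts concrete, Definition 8.1 can be transcribed almost clause for clause. The following is a minimal Python sketch under several assumptions that are not in the text: worlds are frozensets of the atoms true at them, a context is a dict from each world i to $C_i$, sentences are functions from a context and a world to a truth value, one-place might and must quantify over $C_i$ in the usual way, and the "if defined" proviso is ignored.

```python
def atom(name):
    """A non-modal atomic sentence: true at a world iff the world contains it."""
    return lambda C, i: name in i

def might(phi):
    """One-place might: phi is true somewhere in C_i."""
    return lambda C, i: any(phi(C, j) for j in C[i])

def must(phi):
    """One-place must: phi is true everywhere in C_i."""
    return lambda C, i: all(phi(C, j) for j in C[i])

def plus(C, p):
    """Clause (ii): C + p maps each i to C_i intersected with the p-worlds (p read in C)."""
    return {i: frozenset(j for j in C[i] if p(C, j)) for i in C}

def if_(p, q):
    """Clause (i): every p-world in C_i is a world where q is true in C + p."""
    def truth(C, i):
        shifted = plus(C, p)                                 # context shift
        return all(q(shifted, j) for j in C[i] if p(C, j))   # index shift
    return truth

# Two worlds: a is a (p and q)-world, b is a world where neither holds.
a, b = frozenset({"p", "q"}), frozenset()
cell = frozenset({a, b})
C = {a: cell, b: cell}
p, q = atom("p"), atom("q")

print(must(q)(C, a))           # False: b is in C_a and is not a q-world
print(if_(p, must(q))(C, a))   # True: must q is checked in C + p, where only a survives
print(if_(p, q)(C, a))         # True: the context shift is idle for non-modal q
```

The second and third lines of output illustrate the point made before the definition: the choice between C and C + p only matters when the consequent has context-sensitive material such as must in it.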
Such a story about if is iffy: if expresses a relation between relevant an-

tecedent and consequent worlds and that relation lives up to all the con-
straints we insisted on earlier. Hence if means all. And it expresses that no
matter whether it scopes over a universal modal or an existential modal or
no modal at all in the consequent. It is also doubly shifty. It is index-shifty
since the truth of (if p)(q) at i depends on the truth of the constituent q
at worlds other than i. It is context-shifty since the truth of (if p)(q) in C
depends on the truth of the constituent q in contexts other than C.
The if /modal interactions that were such trouble were only trouble be-
cause we forgot to keep track of the context-shifting job of if -clauses. And
doing that, even in the simple context-shifting in Definition 8.1, is enough to
make iffiness sit better with the Facts.
I know that just one of my marbles is in the box (either Red or Yellow)
but do not know which it is. Narrowscope the modals. Then all of
these can be true together:
(20) a. $\mathit{might}\ p \wedge \mathit{might}\ q$
b. $(\mathit{if}\ \neg q)(\mathit{must}\ p)$
c. $(\mathit{if}\ \neg p)(\mathit{must}\ q)$
Observation 8.1 (iffiness & consistency). Assume iffiness + shiftiness
(Definition 8.1). Suppose p and q partition the possibilities in C. The
(narrowscoped) sentences in (20) can all be true together in C.
Proof. Here is why. Suppose, for concreteness and without loss of generality,
that C contains just two worlds: i, a (p ∧ ¬q)-world, and j, a (q ∧ ¬p)-world.
So (20a) is true at i.
Now take (20b). It is true at i in C, given iffiness + shiftiness, iff all the
possibilities in $C_i \cap \llbracket \neg q \rrbracket$ are possibilities that $\llbracket \mathit{must}\ p \rrbracket^{C+\neg q}$ maps to true.
Thus we have to see whether the following holds:
if $k \in C_i \cap \llbracket \neg q \rrbracket$ then $\llbracket \mathit{must}\ p \rrbracket^{C+\neg q,\,k} = 1$
(20b) is true at i in C iff this is so. But $C_i \cap \llbracket \neg q \rrbracket = \{i\}$, so we have to
see whether or not $\llbracket \mathit{must}\ p \rrbracket^{C+\neg q,\,i} = 1$. Equivalently: the if is true at i iff
$(C + \neg q)_i \subseteq \llbracket p \rrbracket$. And since i is in fact a p-world the if is true at i in C. And
mutatis mutandis for (20c).
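
The proof can be replayed on the two-world model. The following is a minimal Python sketch, assuming the same encoding as before (worlds as frozensets of true atoms, a context as a dict from i to $C_i$, definedness ignored). The function `unshifted_if` is a hypothetical contrast case that evaluates the consequent in C rather than in C + p. It is included only to show what the context shift buys.

```python
def atom(name):
    return lambda C, i: name in i

def neg(phi):
    return lambda C, i: not phi(C, i)

def conj(phi, psi):
    return lambda C, i: phi(C, i) and psi(C, i)

def might(phi):
    return lambda C, i: any(phi(C, j) for j in C[i])

def must(phi):
    return lambda C, i: all(phi(C, j) for j in C[i])

def plus(C, p):
    """C + p: restrict every C_i to the p-worlds (Definition 8.1, clause ii)."""
    return {i: frozenset(j for j in C[i] if p(C, j)) for i in C}

def if_(p, q):
    """Shifty strict conditional (Definition 8.1, clause i)."""
    return lambda C, i: all(q(plus(C, p), j) for j in C[i] if p(C, j))

def unshifted_if(p, q):
    """Contrast case (hypothetical): same index shift, but q is read in C."""
    return lambda C, i: all(q(C, j) for j in C[i] if p(C, j))

# i is a (p and not-q)-world, j is a (q and not-p)-world.
i, j = frozenset({"p"}), frozenset({"q"})
cell = frozenset({i, j})
C = {i: cell, j: cell}
p, q = atom("p"), atom("q")

s20a = conj(might(p), might(q))
s20b = if_(neg(q), must(p))
s20c = if_(neg(p), must(q))

shifty = [s(C, i) for s in (s20a, s20b, s20c)]
unshifted_20b = unshifted_if(neg(q), must(p))(C, i)
print(shifty)          # → [True, True, True]
print(unshifted_20b)   # → False
```

Without the shift, must p in the consequent of (20b) still ranges over all of $C_i$, which contains the non-p-world j, so the narrowscoped conditional comes out false alongside a true (20a).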
The operator view isn’t at odds with consistency after all. It is also easy
to predict if/must (Fact 2) and if/might (Fact 3). Here are the narrowscoped
analyses of the motivating examples:

(21) a. $(\mathit{if}\ p)(\mathit{must}\ q)$
b. $(\mathit{if}\ p)(q)$
(22) a. $(\mathit{if}\ p)(\mathit{might}\ q)$
b. $\mathit{might}\,(p \wedge q)$
Observation 8.2 (iffiness, if/must, & if/might). Assume iffiness + shiftiness
(Definition 8.1). Then:
i. If $S_1$, then $S_2$ ≈ If $S_1$, then must $S_2$
ii. If $S_1$, then might $S_2$ ≈ might [$S_1$ and $S_2$]
Proof. If must q is true then so is q, no matter the world and context. So
it’s easy to see that when (21a) is true so is (21b). Now suppose (21b) is
true at i (with respect to C). Then all of the p-worlds in $C_i$ are q-worlds
($C_i \cap \llbracket p \rrbracket \subseteq \llbracket q \rrbracket^{C+p}$). But if they are all worlds at which q is true, then i (and
so, given well-behavedness, every world in $C_i$) is equally a world at which
must q is true (with respect to C + p). And so (21a) is true, at i in C, if (21b)
is. That’s just what if/must requires.
if/might is no different. The noteworthy part is seeing how iffiness +
shiftiness predicts that when (22a) is true then so is (22b). Note that (22a) is
true at i (with respect to C) just in case all of the p-worlds in $C_i$ are worlds
where might q, evaluated in C + p, is true. By well-behavedness we have
that:
if $j, k \in C_i \cap \llbracket p \rrbracket$ then $(C + p)_j = (C + p)_k = C_i \cap \llbracket p \rrbracket$
If there is a q-world in $(C + p)_j$, then might q is true throughout this set.
Since might q is an existential modal, if it is true with respect to C + p it
must also be true with respect to C. (Updating contexts with + is monotone.)
Whence it follows that the if with a commingling might is true at i iff among
the p-worlds in $C_i$ lies a q-world. And any such q-world will do to witness
the truth of might (p ∧ q) at i in C. That’s just what if/might requires.
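
Both halves of Observation 8.2 can be confirmed exhaustively on a small model. The following is a minimal Python sketch with several assumptions of its own: a three-world universe, contexts given by partitions (so that well-behavedness holds), non-modal p and q given as sets of worlds, and the "if defined" proviso read as requiring that the antecedent be compatible with $C_i$. That reading of definedness is an assumption here, not something this section spells out.

```python
from itertools import combinations

W = (0, 1, 2)
PARTITIONS = [
    [{0, 1, 2}],
    [{0}, {1, 2}],
    [{1}, {0, 2}],
    [{2}, {0, 1}],
    [{0}, {1}, {2}],
]

def subsets(xs):
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def context(partition):
    """Map each world i to the cell containing it; that cell is C_i."""
    return {i: frozenset(cell) for cell in partition for i in cell}

def prop(s):
    """A non-modal sentence whose truth set is s."""
    return lambda C, i: i in s

def conj(phi, psi):
    return lambda C, i: phi(C, i) and psi(C, i)

def might(phi):
    return lambda C, i: any(phi(C, j) for j in C[i])

def must(phi):
    return lambda C, i: all(phi(C, j) for j in C[i])

def plus(C, p):
    return {i: frozenset(j for j in C[i] if p(C, j)) for i in C}

def if_(p, q):
    return lambda C, i: all(q(plus(C, p), j) for j in C[i] if p(C, j))

def check_observation_8_2():
    """True iff both equivalences hold at every defined point of every model."""
    props = subsets(W)
    for part in PARTITIONS:
        C = context(part)
        for ps in props:
            for qs in props:
                p, q = prop(ps), prop(qs)
                for i in W:
                    if not (C[i] & ps):
                        continue  # antecedent incompatible with C_i: treated as undefined
                    if if_(p, q)(C, i) != if_(p, must(q))(C, i):
                        return False
                    if if_(p, might(q))(C, i) != might(conj(p, q))(C, i):
                        return False
    return True

print(check_observation_8_2())
```

The definedness filter does real work in this sketch: if the antecedent is incompatible with $C_i$, the conditional with might is vacuously true while might (p ∧ q) is false, so the second equivalence is only claimed for the defined cases.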
Indicatives play well with epistemic modals. That interaction seemed

hard to square with old school views that take if to express a conditional
operator. No way of sorting out the relative scopes between the modals and
the conditional seemed right. But that is because we mistakenly thought that
antecedents of conditionals only have one job to do. They shift the index at
which we check to see if the consequent is true. But they also contribute to the
context that is relevant when we do that checking. Once we let antecedents do
both their index-shifting and context-shifting jobs we can safely narrowscope
and there is no special problem posed for old school iffiness. The if in
(if p)(modal q) means the same iffy thing (inclusion!), saying that all the
(relevant) worlds where p is true are worlds where modal q is true. That’s
so whether the oomph of modal is universal or existential or null and does
nothing to get in the way of explaining the Facts. That is something we fans
of iffiness ought to dig.28
9 What is at stake
Given the success of anti-iffiness why bother with iffiness at all? A fair
question. Given the context-shifting I’m advocating for fans of iffiness, what’s
the difference between old school and new school? Another fair question. I
owe some answers.
I make three (not wholly unrelated) claims. First, even if the shifty version
of the operator view and the basic version of the restrictor view covered the
same ground, there is still reason to explore the operator view. Second, the
views have different conceptual roots and different allegiances. Third, the
views don’t cover the same ground. I need to argue for each of these.
Suppose that — at least when it comes to accounting for data about the
sorts of constructions at issue here — there’s nothing to choose between
iffiness + shiftiness and anti-iffiness. Even under that assumption there
is reason to take this version of the operator view seriously. That is because
it is important to set the record straight. Maybe you don’t like skyhooks,
Chuck Taylors, and conditional connectives expressing iffy operators in your
lfs. It is important to know that whatever your reasons, it can’t be because
iffiness can’t be squared with the Facts about how if s and modals interact.
The Ramsey test intuition leads naturally to a story according to which
if expresses a bona fide conditional operator that captures the restricting
behavior of if -clauses. Thus the restricting behavior of if -clauses can be a
28 Before I said that I wanted to ignore issues about how this version of the operator view can
meet Lewis’s challenge about the ways if -clauses and adverbs of quantification interact,
saving that argument for another day. I want to stick to that (it really is an argument for
another day), but the general idea is straightforward. First, adjust the kinds of information
represented by a context so that we can sensibly quantify over individuals and the events
they participate in. Second, allow that quantificational domains can be restricted by material
in if -clauses — those domains play the role of the subordinate or derived context. Adverbs
of quantification appear under the conditional and have their usual denotations.
part of, rather than an obstacle to, their expressing something iffy. That is
cool.
But what’s the real difference between the views? One view says we have
no conditional operator, just a complicated modal with a slot for a restrictor.
The other says we have a conditional operator but that its antecedent shifts
the context thereby acting like a restrictor. Tomato/tomăto, right? Wrong!
Here is one way of seeing that. Consider three indicatives:
(23) a. If Scorpio succeeds, then the end must be near.

b. If Scorpio succeeds, then the end is near.
c. If Jimbo is in detention, then Nelson might be.
Compare (23a) and (23c). The restrictor view says these have different modals
and different arguments for each of the slots in those modals. So, apart from
the fact that each is a modal expression of some flavor or other, there is
nothing much in common between the two. They are as different as Some
students smoke and All dogs bark: each is a quantificational expression of
some flavor or other. The operator view says something different. It says that,
despite their different antecedents and different consequents, they still share
a common iffy core: there is a conditional connective in common between
them and it contributes the same thing to each of the sentences it occurs in.
Or compare the must-enriched (23a) with its bare counterpart (23b). The
restrictor view says the bare indicative just is the must-enriched version
in disguise. That is how it predicts if/must (Fact 2). It thus treats bare
indicatives as a special case, dealt with by positing a covert and inaudible
necessity modal. Maybe there is reason to posit such an operator, and an
independent and principled reason to posit the necessity modal instead of an
existential one or some different modal with different quantificational force,
and maybe those reasons outweigh the cost of the positing. The operator
view adopts a very different stance here and that is what I want to point out.
It says that bare indicatives like (23b) are ordinary conditionals and their
counterparts with must-ed consequents like (23a) are ordinary conditionals
that happen to have must in their consequents. No special cases, no positing
of inaudible operators, and if/must comes out as a prediction not as a
stipulation. None of this is a knock-down argument for or against either of
the views — it’s not meant to be — but it does highlight their difference in
worldview.
All of this has been under the assumption that both the doubly shifty iffy
view and the anti-iffy restrictor view cover the same ground about how if s
and modals interact. But that’s not quite right.29 So far we have only worried
about how it is that a conditional sentence manages to express what might be
if such-and-such or how it manages to express what must be if such-and-such.
But conditional information can be more economically expressed than that.
We can just as well have a single conditional sentence that expresses what
must be and what might be if such-and-such.
A case in point: although I have lost my marbles, I know that some of
them — at least one of Red, Yellow, and Blue — are in the box. In fact I know
a bit more. I know that Yellow and Blue are in the same spot and so that Red
can’t be elsewhere if Yellow isn’t in the box. Another example: arriving at
the party, I’m not sure who’s there and who isn’t. I do know that Lenny goes
wherever Carl goes (but sometimes Lenny goes alone), but Monty never goes
where Lenny goes.
(24) a. If Yellow is in the box, then Red might be and Blue must be.
b. If Lenny is at the party, then Carl might be but Monty isn’t.
These are not exotic, each conditional is a true thing to say in the circum-
stances, and there is space for the iffy view and incarnations of the anti-iffy
restrictor view to differ on the truth conditions they assign to conditionals
like these — and so the two views can’t be stylistic variants.
Here is the issue: (24a) and (24b) have glosses:
(25) a. If Yellow is in the box, then Red might be and if Yellow is in the box,
then Blue must be.
29 There are reasons independent of interaction with epistemic modals to think that anti-
iffiness, in its purest if -only-restricts form, can’t be the whole story. If it were, and if -clauses
and when-clauses have the same restricting behavior, then we wouldn’t expect differences in
cases like this:
(i) a. If the Cubs get good pitching and timely hitting after the break, they might win
it all.
b. When the Cubs get good pitching and timely hitting after the break, they might
win it all.
But we do detect a difference. I can say something true-if-hopeful with (ia). But (ib) passes
optimistic and heads straight for delusional. It’s hard to see where to locate the differ-
ence — whether it’s semantic or pragmatic — if the semantic contribution of if and when is
purely to mark the restrictor slot for the common operator might. (Lewis (1975) noticed
that sometimes a restricting if is odd when its corresponding restricting when is fine. But
he labeled these differences “stylistic variations”.) Some arguments along these lines are
pushed by von Fintel & Iatridou (2003).
b. If Lenny is at the party, then Carl might be but if Lenny is at the
party, then Monty isn’t.
These swap a single conditional with a complicated consequent for a conjunction
of simple conditionals. The simple incarnation of the anti-iffy restrictor
view in Definition 7.2 says we do one thing when a conditional consequent
has an overt modal, and do another when there isn’t. But we didn’t say how
out in the open a modal must be to count as overt. Depending on what we
say, we can get divergence between the operator view and the restrictor view
for cases like these.
Assume — for now — that a modal is overt in a sentence iff it is the con-
nective featured in (the lf of) that sentence.30 Under that assumption, it
is then easy to see that the two stories come apart: iffiness + shiftiness
predicts that (24a) is equivalent to (25a) and so true (in the relevant context)
and anti-iffiness does not. That is because the consequent of (24a) isn’t
decorated with a leading modal (it’s a conjunction of modals), and so we have
to posit one. So (24a) gets an L-representation like
(26) $\mathit{must}\,(p)(\mathit{might}\,(\top)(q) \wedge \mathit{must}\,(\top)(r))$
But the truth conditions of (26) do not match the truth conditions of (25a)
and so do not match the truth conditions of the original (24a): (26) is false in
the context as we set it up even though both (24a) and (25a) are true.
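
The divergence can be seen on a concrete model of the marble scenario. The following is a minimal Python sketch, assuming the context contains exactly three worlds (only Red in the box, Yellow and Blue in the box, all three in the box), which is one reading of the setup behind (24a). The two-place modals follow Definition 7.1, the conditional follows Definition 8.1, and all identifiers are illustrative.

```python
def atom(name):
    return lambda C, i: name in i

def conj(phi, psi):
    return lambda C, i: phi(C, i) and psi(C, i)

TOP = lambda C, i: True   # the tautology

# One-place modals and the shifty conditional (the operator view).
def might(phi):
    return lambda C, i: any(phi(C, j) for j in C[i])

def must(phi):
    return lambda C, i: all(phi(C, j) for j in C[i])

def plus(C, p):
    return {i: frozenset(j for j in C[i] if p(C, j)) for i in C}

def if_(p, q):
    return lambda C, i: all(q(plus(C, p), j) for j in C[i] if p(C, j))

# Two-place modals (the restrictor view); the scope is read in C itself.
def might2(restrictor, scope):
    return lambda C, i: any(restrictor(C, j) and scope(C, j) for j in C[i])

def must2(restrictor, scope):
    return lambda C, i: all(scope(C, j) for j in C[i] if restrictor(C, j))

# Worlds list which marbles are in the box (assumed reading of the scenario).
worlds = [frozenset({"R"}), frozenset({"Y", "B"}), frozenset({"R", "Y", "B"})]
cell = frozenset(worlds)
C = {w: cell for w in worlds}
R, Y, B = atom("R"), atom("Y"), atom("B")
here = worlds[1]

iffy_24a = if_(Y, conj(might(R), must(B)))(C, here)
iffy_25a = conj(if_(Y, might(R)), if_(Y, must(B)))(C, here)
lf_26 = must2(Y, conj(might2(TOP, R), must2(TOP, B)))(C, here)
restrictor_25a = conj(might2(Y, R), must2(Y, B))(C, here)

print(iffy_24a, iffy_25a)       # → True True
print(lf_26, restrictor_25a)    # → False True
```

The inner must in (26) is restricted only by the tautology, so it ranges over every world in the context, including the one where only Red is in the box. That is what makes (26) false even though the gloss in (25a) is true on either view.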
Now assume, instead, that a modal is overt iff it is pronounced — no
matter how arbitrarily deeply embedded. Then (26) isn’t the right anti-iffy
lf for (24a). Instead, we get something more sensible: (24a) and (25a) have
the same lf. There’s no in-principle problem with that.31 But what about
conditionals like (24b)? We don’t want to posit a must that outscopes the
pronounced might. So we have to posit a narrowscoped one. In order to
get the posited modal appropriately restricted — so that (24b) comes out
equivalent to (25b) — we have two obvious options. Option (i): Argue that
conditionals like those in (24) are not single conditionals at all, that they are
really conjunctions of two simple modals. That way there is no difference
at all between the conditionals in (24) and the glosses in (25). Option (ii):
Enrich our intermediate language to allow for explicit domain-restricting
variables, and provide a mechanism for the inheriting of those restrictions
30 In this sense, a modal is any (non-equivalent) stack of musts, mights, and negations.
31 Though it doesn’t come free: it puts strain on the process of assigning formulas of L to serve
as the lfs of sentences of natural language.
across intervening operators like conjunction. Both options are open, and
party line proponents of anti-iffiness are free to pursue them. But they do
require work. Option (i) posits movement we’d not like to have to posit, treats
conditionals with apparent conjoined consequents as yet another special
case, and describes rather than explains why the conditionals in (24) are
glossable by those in (25). Option (ii) requires more expressive resources
for L than we thought necessary and requires something over and above
the anti-iffy story as it stands to say when and how domain restriction gets
inherited over distance and across intervening operators. That’s not an
argument against this option but a description of it.32
But none of that really matters: my point was that iffiness + shiftiness
and anti-iffiness aren’t notational variants. And they are not: the iffy story
takes conditionals like (24) in perfect stride. No special cases, no positing
of inaudible operators, no stress on the parser in assigning formulas of
L to serve as the lfs of conditional sentences, no movement. We get the
right truth conditions, and we get as a prediction not a stipulation that the
conditionals in (24) are equivalent to those in (25).
10 Context and dynamics
Not every fan of old school iffiness will want to follow me this far. But there
is a cost to cutting their trip short since they must then deny or explain away
one of the Facts. Iffiness, they’ll no doubt point out, is not without its own
costs: the price of iffiness is shiftiness twice over.
I reply that there are costs and then there are costs. Embracing context-
shiftiness may be a cost, but I want to point out that it is not a new cost: it
makes the analysis here a broadly dynamic semantic account of indicatives.33
So shiftiness is a cost you may already be willing to bear. I want to (briefly)
point out how it is that this shiftiness amounts to a four-fold dynamic
perspective on modals and conditionals.
32 Something in the neighborhood of Option (ii) is developed (though not with an eye to
conjoined consequents) in von Fintel (1994). For a recent discussion see Rawlins 2008.
33 The general idea that consequents are evaluated in a subordinate or derived context is
standard in dynamic semantics — see, e.g., dynamic treatments of donkey anaphora (Groe-
nendijk & Stokhof 1991) or dynamic treatments of presupposition projection in conditional
antecedents and consequents (Heim 1992; Beaver 1999) or dynamic treatments of counter-
factuals (Veltman 2005; von Fintel 2001; Gillies 2007). But exploiting a derived context isn’t
quite a litmus test for dynamics since that is something shared by a lot of Ramsey-inspired
accounts, whether or not they count as ‘dynamic’.
The version of the operator view I’m advocating for fans of iffiness takes
the truth of an indicative (at an index, in a context) to be doubly shifty.
That doubly shifty behavior makes the semantics dynamic in the sense that
interpretation both affects and is affected by the values of contextually
filled parameters. Whether (if p)(q) is true at i in C depends on C; the
indicative can be true at i for some choices of C and false at i for others. So
interpretation is context-dependent. Whether (if p)(q) is true at i in C also
depends on the subordinate context C + p. Interpreting the indicative in C
affects — temporarily — the context for interpreting some subparts of it. So
interpretation is also context-affecting.
This analysis is also dynamic in a second sense. It makes certain sentences
unstable — the truth-value a sentence gets in a context C is not a stable or
persistent property since it can have a different truth-value in a context $C'$
that contains properly more information.
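Definition 10.1 (persistence).
i. p is t-persistent iff $\llbracket p \rrbracket^{C,i} = 1$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 1$
ii. p is f-persistent iff $\llbracket p \rrbracket^{C,i} = 0$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 0$
p is persistent iff it is both t- and f-persistent.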
The boolean bits are, of course, both t- and f -persistent and so persistent full-
stop. But not the modals: might, being existential, is f - but not t-persistent;
must goes the other way. And since if is a strict conditional, equivalent to a
necessity modal scoped over a material conditional, its pattern of persistence
is just like that for must.34
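The persistence pattern just described can be checked on a toy model. The following is only a minimal sketch under simplifying assumptions that are not part of the text: worlds are modelled as sets of true atoms, a context is a plain set of worlds, the modals quantify over the whole context regardless of the index, and the empty (absurd) context is ignored. The helper names (`atom`, `might`, `must`, `subcontexts`, `keeps_value`) are hypothetical, and the check covers only the sub-contexts of one starting context rather than all contexts.

```python
from itertools import combinations

# Assumption of this sketch: a world is a frozenset of the atoms true at it,
# a context is a frozenset of worlds, and a sentence maps (context, world) to a bool.
def atom(name):
    return lambda context, world: name in world

def might(sentence):
    # Existential: some world in the context verifies the prejacent.
    return lambda context, world: any(sentence(context, w) for w in context)

def must(sentence):
    # Universal: every world in the context verifies the prejacent.
    return lambda context, world: all(sentence(context, w) for w in context)

def subcontexts(context):
    """Yield every nonempty subset of the context (the empty context is skipped)."""
    worlds = list(context)
    for size in range(1, len(worlds) + 1):
        for combo in combinations(worlds, size):
            yield frozenset(combo)

def keeps_value(sentence, context, world, value):
    """One instance of Definition 10.1: if the sentence has `value` at
    (context, world), does it keep that value in every nonempty sub-context?"""
    if sentence(context, world) != value:
        return True  # the condition holds vacuously for this starting context
    return all(sentence(sub, world) == value for sub in subcontexts(context))

W_P = frozenset({"p"})   # a world where only p holds
W_Q = frozenset({"q"})   # a world where only q holds
C = frozenset({W_P, W_Q})
p = atom("p")

print(keeps_value(p, C, W_P, True))          # a boolean atom keeps its truth
print(keeps_value(might(p), C, W_P, True))   # might p is true in C but not in {W_Q}
print(keeps_value(must(p), C, W_P, False))   # must p is false in C but not in {W_P}
```

On this toy model the atom behaves persistently, while the truth of might p and the falsity of must p can both be lost when the context shrinks, which is the asymmetry the paragraph above describes.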
These two senses in which the story is dynamic are two sides of the same
coin. Together they explain how it is that the narrowscoped conditionals
(if ¬p)(must q) and (if ¬q)(must p) are consistent with the partitioning
modals in might p ∧ might q. From the fact that $i \in \llbracket (\mathit{if}\ \neg p)(\mathit{must}\ q) \rrbracket^{C}$ and
$i \in \llbracket \neg p \rrbracket^{C}$ it does not follow that $i \in \llbracket \mathit{must}\ q \rrbracket^{C}$. Indeed, with my marbles
lost, this is sure to be false at i in C since might p is true. What is true at i is
that, in the subordinate or derived context C + ¬p, must q is true. That
is allowed because must isn’t f-persistent. But that is not at odds with the
might claim. And mutatis mutandis for the other if.
34 This pattern makes the treatment of indicatives here similar in some respects to Veltman’s
(1985) data semantic treatment of indicatives. But there are important differences between
the two stories. Here’s one: (if p)(might q) is data semantically equivalent to (if p)(q).
That won’t do given Fact 3.
So we have dynamics twice over. But so far none of this looks quite
like what is usually called “dynamic semantics”. In that sense of dynamics
meaning isn’t associated with truth conditions or propositions but with
context change potentials, effects on relevant states of information. Take
an information state s to be a set of worlds, and say that what a sentence
means is how its lf updates information states. That assigns to sentences
the semantic type usually reserved for programs and recipes; they express
relations between states — intuitively, the set of pairs of states such that
executing the program in the first state terminates in the second. We can
think of all sentences in this way, thereby treating them as instructions for
changing information states. Thus: the meaning of a sentence p is how it
changes an arbitrary information state. We might put that by saying the
denotation [p] applied to s results in state $s'$; in post-fix notation $s[p] = s'$.35
Now say that p is true in s iff s[p] = s, for then the information p carries is
already present in s.36
Having gone this far, we can make good on the Ramsey test this way:
Definition 10.2 (Dynamic Iffiness).
$s[(\mathit{if}\ p)(q)] = \{\, i \in s : q \text{ is true in } s[p] \,\}$
Some programs have as their main point to make such-and-such the case;
others to see whether such-and-such. Programs of the latter type are tests
and they either return their input state (if such-and-such) or fail (otherwise).
That is the kind of program Definition 10.2 says if is.37 It says an if tests
s to see whether the consequent is true in s[p]. But — in good Ramseyian
spirit — s[p] is just the subordinate context got by hypothetically adding p
to s. Truth isn’t persistent here, either. That is because a state may pass a
test posed by an existential (Are there p-possibilities?) and yet have
35 For the fragment without if s the updates are as you would expect (Veltman 1996). For the
if-free fragment of L, define [·] as follows:
i. $s[p_{atomic}] = \{\, i \in s : i(p_{atomic}) = 1 \,\}$
ii. $s[\neg p] = s \setminus s[p]$
iii. $s[p \wedge q] = s[p][q]$
iv. $s[\mathit{might}\ p] = \{\, i \in s : s[p] \neq \emptyset \,\}$
It then follows straightaway that, for the if- and modal-free fragment, $s[p] = s \cap \llbracket p \rrbracket$.
36 This generalizes the plain vanilla story about satisfaction we were taught when first learning
propositional logic: as the story usually goes, a boolean p is true relative to a set of
possibilities s iff all the possibilities in s are in $\llbracket p \rrbracket$. But that is equivalent to saying that
adding p to the information in s produces no change: $s \cap \llbracket p \rrbracket = s$ iff $s \subseteq \llbracket p \rrbracket$.
37 See, e.g., Gillies 2004.
some narrower, less uncertain state fail it (No more p-possibilities!).

And dually for the universal must and if .
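The update clauses of footnote 35 and the test in Definition 10.2 can be run on a small example. This is a minimal sketch, not the paper's own implementation: formulas are encoded as nested tuples, worlds as sets of true atoms, and must is treated as the dual ¬might¬, all of which are assumptions of the sketch. The names `update` and `is_true_in` are hypothetical.

```python
# Assumptions of this sketch: a world is a frozenset of the atoms true at it,
# an information state is a frozenset of worlds, and formulas are nested tuples
# such as ("if", ("not", P), ("must", Q)).
EMPTY = frozenset()

def update(state, formula):
    """Return state[formula], following footnote 35 and Definition 10.2."""
    op = formula[0]
    if op == "atom":
        return frozenset(w for w in state if formula[1] in w)
    if op == "not":
        return state - update(state, formula[1])
    if op == "and":
        return update(update(state, formula[1]), formula[2])
    if op == "might":
        # A test: pass the state through if the prejacent is compatible with it.
        return state if update(state, formula[1]) else EMPTY
    if op == "must":
        # Assumed here to be the dual of might.
        return update(state, ("not", ("might", ("not", formula[1]))))
    if op == "if":
        # A test: is the consequent true in the hypothetical state s[p]?
        hypothetical = update(state, formula[1])
        return state if is_true_in(hypothetical, formula[2]) else EMPTY
    raise ValueError(f"unknown operator: {op!r}")

def is_true_in(state, formula):
    """A formula is true in a state iff updating with it changes nothing."""
    return update(state, formula) == state

P, Q = ("atom", "p"), ("atom", "q")
W_P, W_Q = frozenset({"p"}), frozenset({"q"})
s = frozenset({W_P, W_Q})  # the marble is in exactly one of two places

print(is_true_in(s, ("and", ("might", P), ("might", Q))))
print(is_true_in(s, ("if", ("not", P), ("must", Q))))
print(is_true_in(s, ("must", Q)))
print(is_true_in(frozenset({W_Q}), ("might", P)))
```

In this two-world state the conjunction of mights and the narrowscoped conditional both come out true while must q does not, and might p stops being true once the state narrows to the q-world.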
An iffy account like the one in Definition 10.2 is dynamic in this third
sense. But the doubly-shifty operator view iffiness + shiftiness doesn’t
look much like a dynamic semantics in that sense. That analysis looks static,
assigning truth-conditions to indicatives at a world in a context. And we can
recover propositions if the mood strikes us. But the two stories are in fact
the same: lack of persistence plus the global behavior of the modals and
if s in the doubly shifty story make it equivalent to a dynamic story of the
indicative that dispenses with the assignment of propositions of the normal
sort from the beginning.38 Even though I told the story about truth-values
assigned at contexts and indices, it is equivalent to a story about changing
information states. So we have dynamics thrice over.
We have gotten this far, and found ways to predict the Facts about how
indicatives and epistemic modals interact, without taking a stand on when
one sentence entails another. (Having said nothing about entailment we
couldn’t have said anything about modus ponens either.) Entailment is
usually taken to be preservation of truth at a point of evaluation: $p_1, \ldots, p_n$
entail q iff q is true at every point at which $p_1, \ldots, p_n$ are all true.
Not necessarily so in a dynamic semantics. Often enough,
what is important and what an entailment relation ought to capture is not
preservation of truth but preservation of information flow — what must be
true after adding the information carried by the premises. That is an update-
to-test entailment relation.39 Similarly, since the story as I have told it turns
out to be a dynamic one, we ought to expect a larger menu of options for
what it takes for a collection of premises to entail a conclusion. That is
because truth is sensitive to both context and index and contexts can shift
about as we move from the $p_i$’s to q. To make sure entailment is sensitive
to those shifts, we shouldn’t merely require preservation of truth-at-a-point.
Instead, just as in a more explicitly dynamic set-up, we want to augment the
38 The standard benchmark for dynamics is whether the interpretation function [·] is either
non-introspective (Can it be that $s[p] \not\subseteq s$?) or non-continuous (Can it be that
$s[p] \neq \bigcup_{i \in s} \{i\}[p]$?). In set-ups like the one in Definition 10.2, the behavior of indicatives is not
continuous. See Gillies 2009 for the details on how the iffy story as I have put it is equivalent
to a more directly dynamically iffy semantics, and how the right notions of entailment
coincide in the two set-ups.
39 For more about the space of options for entailment relations in dynamic semantics see van
Benthem 1996 and Veltman 1996. Update-to-test entailment is a lot like Stalnaker’s (1975)
notion of reasonable inference.
context with the information of the premises, evaluating q not in C but in
$(C + p_1) + \cdots + p_n$. And that corresponds exactly to the dynamic
update-to-test entailment relation over our language L. That is the fourth way in which
the semantics here is dynamic.
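For concreteness, the dynamic update-to-test relation can be written out in the post-fix notation used above. What follows is a sketch of the standard formulation, with a worked instance for modus ponens that assumes Definition 10.2, the update clauses of footnote 35, and the fact that updating the empty state with any sentence of this fragment returns the empty state. The abbreviation t is introduced only for this illustration.

```latex
% Update-to-test entailment: update with the premises in order, then test the conclusion.
\[
p_1, \ldots, p_n \models q
\quad\text{iff}\quad
\text{for every state } s:\;
s[p_1]\cdots[p_n][q] = s[p_1]\cdots[p_n]
\]

% Worked instance (modus ponens). Let t = s[(\mathit{if}\ p)(q)][p].
\begin{align*}
\text{Case 1: } & q \text{ is true in } s[p].
  && \text{Then } s[(\mathit{if}\ p)(q)] = s,\ \text{so } t = s[p] \text{ and } t[q] = t. \\
\text{Case 2: } & q \text{ is not true in } s[p].
  && \text{Then } s[(\mathit{if}\ p)(q)] = \emptyset,\ \text{so } t = \emptyset \text{ and } t[q] = \emptyset = t.
\end{align*}
```

Either way q is true in the state reached by updating with the premises, so on these assumptions the conditional and its antecedent together entail the consequent in the update-to-test sense.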
So the doubly shifty behavior of indicatives reflects this four-fold dynamic
perspective. That is useful to know for two reasons. First because it makes
clear what the costs of iffiness are and it makes clear that some of those costs
are not completely new. Second because it makes clear that the dynamic
perspective on modals and conditionals is broader than we may have thought.
The senses in which the story here reflects a dynamic perspective are familiar
senses, but the mechanisms of that iffy story aren’t the usual mechanisms in
a dynamic semantics. The semantics traffics in things like truth conditions
and propositions, not in things like support or programs or context change
potentials. So nothing in the dynamic perspective on modals and conditionals
requires the latter sort of semantic trafficking at the expense of the former
sort. It’s broader than that.
11 An iffy upshot
My preferred version of the operator view says that an indicative is a
doubly-shifty strict conditional over sets of live possibilities. It assigns two jobs to
if -clauses. They have the index-shifting job of shifting the point at which
we check for a consequent’s truth, but they also have the context-shifting
job of shifting the context relevant for deciding at such a point whether a
consequent is true. That is how if can mean the same iffy thing no matter
whether the consequent is modal, and no matter the quantificational force of
that modal, without running afoul of the Facts.
We began with the iffy thesis that conditional information is information
of a conditional. Then we showed that — given some broad constraints for
what counts as a conditional operator properly so called — apparently no
operator view could be squared with the Facts since no way of sorting out
the scopes would work. But all of that assumed that antecedents have no
context-shifting role. So if you want to plump for an incarnation of the
operator view, and you want to square your story with the Facts, you had
better allow for context-shifting.
It’s easy to get the idea that how if s and operators like epistemic modals
interact is an argument for anti-iffiness. But since some iffy stories — this
one! — can account for that data, that’s not right. Nothing about shiftiness
rules out anti-iffiness, of course. And so it’s open to go for a restrictor
view that co-opts context-shifting to account for the way that conditionals
with conjoined consequents turn out equivalent to conjunctions of simpler
conditionals. So if you want to toe the anti-iffy line, you might want to allow
for context-shifting anyway. Of course, that makes toeing the line a bit like
not toeing the line.
References
Adams, Ernest W. 1975. The logic of conditionals. Dordrecht: Reidel.

Beaver, David. 1999. Presupposition accommodation: A plea for common
sense. In Larry Moss, Jonathan Ginzburg & Martin de Rijk (eds.), Logic,
language, and information vol. 2, 21–44. Stanford, CA: CSLI Publications.
https://webspace.utexas.edu/dib97/itallc.pdf.
Belnap, Nuel D. 1970. Conditional assertion and restricted quantification.
Noûs 4(1). 1–12. doi:10.2307/2214285.
Bennett, Jonathan. 2003. A philosophical guide to conditionals. Oxford
University Press.
van Benthem, Johan. 1986. Essays in logical semantics (Studies in Linguistics
and Philosophy 29). Dordrecht: Reidel.
van Benthem, Johan. 1996. Exploring logical dynamics. Stanford, CA: CSLI
Publications.
Dekker, Paul. 2001. On if and only. Semantics and Linguistics Theory [SALT]
11. 114–133. http://staff.science.uva.nl/~pdekker/Papers/OIAO.pdf.
Edgington, Dorothy. 1995. Conditionals. Mind 104(414). 235–329.
doi:10.1093/mind/104.414.235.
Edgington, Dorothy. 2008. Conditionals. In Edward N. Zalta (ed.), The Stanford
encyclopedia of philosophy, Winter 2008 edn. http://plato.stanford.edu/
archives/win2008/entries/conditionals/.
von Fintel, Kai. 1994. Restrictions on quantifier domains. Amherst, MA:
University of Massachusetts dissertation. http://semanticsarchive.net/
Archive/jA3N2IwN/fintel-1994-thesis.pdf.
von Fintel, Kai. 1997. Bare plurals, bare conditionals, and only. Journal of
Semantics 14(1). 1–56. doi:10.1093/jos/14.1.1.
von Fintel, Kai. 1998a. The presupposition of subjunctive conditionals. In Uli
Sauerland & Orin Percus (eds.), The interpretive tract (MIT Working Papers
in Linguistics 25), 29–44. http://mit.edu/fintel/fintel-1998-subjunctive.
pdf.
von Fintel, Kai. 1998b. Quantifiers and if-clauses. Philosophical Quarterly
48(191). 209–214. doi:10.1111/1467-9213.00095.
von Fintel, Kai. 2001. Counterfactuals in a dynamic context. In Michael
Kenstowicz (ed.), Ken Hale: A life in language, 123–152. Cambridge, MA:
MIT Press.
von Fintel, Kai. 2009. Conditionals. Ms, to appear in Seman-
tics: An international handbook of meaning, edited by Klaus von
Heusinger, Claudia Maienborn, and Paul Portner. http://mit.edu/fintel/
fintel-2009-hsk-conditionals.pdf.
von Fintel, Kai & Anthony S. Gillies. 2007. An opinionated guide to epistemic
modality. In Tamar Szabó Gendler & John Hawthorne (eds.), Oxford studies
in epistemology: Volume 2, 32–62. Oxford University Press.
von Fintel, Kai & Anthony S. Gillies. 2008a. CIA leaks. The Philosophical
Review 117(1). 77–98. doi:10.1215/00318108-2007-025.
von Fintel, Kai & Anthony S. Gillies. 2008b. Might made right. In Brian
Weatherson & Andy Egan (eds.), Epistemic modals, Oxford University Press
(to appear). http://rci.rutgers.edu/~thony/fintel-gillies-2008-mmr.pdf.
von Fintel, Kai & Anthony S. Gillies. 2010. Must... stay... strong! Natural Lan-
guage Semantics to appear. http://mit.edu/fintel/fintel-gillies-2010-mss.
pdf.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics. Lecture Notes, MIT.
http://tinyurl.com/intensional.
von Fintel, Kai & Sabine Iatridou. 2003. If and when if -clauses can restrict
quantifiers. Manuscript, MIT. http://web.mit.edu/fintel/www/lpw.mich.
pdf.
Geurts, Bart. 2005. Entertaining alternatives: Disjunctions as modals. Natural
Gibbard, Allan. 1981. Two recent theories of conditionals. In William L.
Harper, Robert Stalnaker & Glenn Pearce (eds.), Ifs, 211–248. Dordrecht:
Reidel.
Gillies, Anthony S. 2004. Epistemic conditionals and conditional epistemics.
Noûs 38(4). 585–616. doi:10.1111/j.0029-4624.2004.00485.x.
Gillies, Anthony S. 2007. Counterfactual scorekeeping. Linguistics and Philos-
ophy 30(3). 329–360. doi:10.1007/s10988-007-9018-6.
Gillies, Anthony S. 2009. On truth-conditions for if (but not quite only if ).
The Philosophical Review 118(3). 325–349. doi:10.1215/00318108-2009-00.
Grice, Paul. 1989. Indicative conditionals. In Studies in the way of words,
58–85. Cambridge, MA: Harvard University Press.
Groenendijk, Jeroen & Martin Stokhof. 1991. Dynamic predicate logic. Lin-
guistics and Philosophy 14(1). 39–100. doi:10.1007/BF00628304.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude
verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Higginbotham, James. 2003. Conditionals and compositionality. Philosophical
Perspectives 17(1). 181–194. doi:10.1111/j.1520-8583.2003.00008.x.
Jackson, Frank. 1987. Conditionals. Oxford University Press.
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jürgen
Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New ap-
proaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de
Gruyter.
Kratzer, Angelika. 1986. Conditionals. Proceedings of the Chicago Linguistics
Society [CLS] 22(2). 1–15.
Lewis, David. 1973. Counterfactuals. Cambridge, MA: Harvard University
Press.
Lewis, David. 1975. Adverbs of quantification. In Edward Keenan (ed.), Formal
semantics of natural language, 3–15. Cambridge University Press.
Lewis, David. 1976. Probabilities of conditionals and conditional probability.
The Philosophical Review 85(3). 297–315. doi:10.2307/2184045.
Rawlins, Kyle. 2008. (Un)Conditionals. Santa Cruz, CA: UC Santa Cruz disser-
tation.
Stalnaker, Robert. 1968. A theory of conditionals. In Nicholas Rescher (ed.),
Studies in logical theory (American Philosophical Quarterly Monograph
Series 2), 98–112. Blackwell.
Stalnaker, Robert. 1975. Indicative conditionals. Philosophia 5(3). 269–286.
doi:10.1007/BF02379021.
Veltman, Frank. 1985. Logics for conditionals. Amsterdam: University of
Amsterdam dissertation.
Veltman, Frank. 1996. Defaults in update semantics. Journal of Philosophical
Logic 25(3). 221–261. doi:10.1007/BF00248150.
Veltman, Frank. 2005. Making counterfactual assumptions. Journal of Se-
mantics 22(2). 159–180. doi:10.1093/jos/ffh022.
Anthony S. Gillies
Department of Philosophy
Rutgers University
thony@rci.rutgers.edu
doi: 10.3765/sp.3.9
Cross-linguistic variation in modality systems: The role of mood∗
Lisa Matthewson
University of British Columbia
Received 2009-07-14 / First Decision 2009-08-20 / Revision Received 2010-02-01 /
Accepted 2010-03-25 / Final Version Received 2010-05-31 / Published 2010-08-06
Abstract The St’át’imcets (Lillooet Salish) subjunctive mood appears in nine

distinct environments, with a range of semantic effects, including weakening
an imperative to a polite request, turning a question into an uncertainty
statement, and creating an ignorance free relative. The St’át’imcets subjunc-
tive also differs from Indo-European subjunctives in that it is not selected by
attitude verbs. In this paper I account for the St’át’imcets subjunctive using
Portner’s (1997) proposal that moods restrict the conversational background
of a governing modal. I argue that the St’át’imcets subjunctive restricts the
conversational background of a governing modal, but in a way which obli-
gatorily weakens the modal’s force. This obligatory modal weakening — not
found with Indo-European non-indicative moods — correlates with the fact
that St’át’imcets modals differ from Indo-European modals along the same
dimension. While Indo-European modals typically lexically encode quantifi-
cational force, but leave conversational background to context, St’át’imcets
modals encode conversational background, but leave quantificational force
to context (Matthewson, Rullmann & Davis 2007, Rullmann, Matthewson &
Davis 2008).
Keywords: Subjunctive, mood, irrealis, modals, imperatives, evidentials, questions,

free relatives, attitude verbs, Salish
∗ I am very grateful to St’át’imcets consultants Carl Alexander, Gertrude Ned, Laura Thevarge,
Rose Agnes Whitley and the late Beverley Frank. Thanks to David Beaver, Henry Davis,
Peter Jacobs, the members of the UBC Pragmatics Research Group (Patrick Littell, Meagan
Louie, Scott Mackie, Tyler Peterson, Amélia Reis Silva, Hotze Rullmann and Ryan Waldie),
three anonymous reviewers, and audiences at New York University, the University of British
Columbia and the 44th International Conference on Salish and Neighbouring Languages
for helpful feedback and discussion. Thanks to Tyler Peterson for helping prepare the
manuscript for publication. This research is supported by SSHRC grants #410-2005-0875
and #410-2007-1046.
©2010 Lisa Matthewson

1 Introduction
Many Indo-European languages possess both modals, lexical items which

quantify over possible worlds, and subjunctive moods, agreement paradigms
which usually require a licensing modal element. The contrast is illustrated
for Italian in (1)–(2). (1) contains modal auxiliaries; (2) contains subjunctive
mood agreement which is licensed by the matrix attitude verb.
(1) a. deve essere nell’ ufficio

must+3sg+pres+ind be in.the office
‘He must be in the office.’ (Italian; Palmer 2006: 102)
b. puo essere nell’ ufficio
may+3sg+pres+ind be in.the office
‘He may be in the office.’ (Italian; Palmer 2006: 102)
(2) dubito che impari

I.doubt that learn+3sg+pres+sbjn
‘I doubt that he’s learning.’ (Italian; Palmer 2006: 117)
Previous work on the Salish language St’át’imcets (a.k.a. Lillooet; see

Matthewson et al. 2007, Rullmann et al. 2008, and Davis, Matthewson & Rull-
mann 2009) has established the existence of a set of modals in this language,
which differ in their semantics from those of Indo-European. Indo-European
modals typically lexically encode distinctions of quantificational force, but
leave conversational background (in the sense of Kratzer 1981, 1991) up to
context. (1a), for example, unambiguously expresses necessity, while (1b)
unambiguously expresses possibility. However, both modals allow either
epistemic or deontic interpretations, depending on context. In contrast,
modals in St’át’imcets lexically encode conversational background, but leave
quantificational force up to context. (3a), for example, is unambiguously epis-
temic, but is compatible with either a necessity or a possibility interpretation,
depending on context. (3b) is unambiguously deontic, but similarly allows
differing quantificational strengths. See Matthewson et al. 2007, Rullmann
et al. 2008, and Davis et al. 2009 for extensive discussion.1
1 All St’át’imcets data are from primary fieldwork unless otherwise noted. Data are presented
in the practical orthography of the language developed by Jan van Eijk; see van Eijk &
Williams 1981. Abbreviations: adhort: adhortative, caus: causative, circ: circumstantial
modal, col: collective, comp: complementizer, cond: conditional, conj: conjunctive,
counter: counter to expectations, deic: deictic, deon: deontic, demon: demonstrative, det:
(3) a. wá7=k’a s-t’al l=ti=tsítcw-s=a

be=epis stat-stop in=det=house-3sg.poss=exis
s=Philomena
nom=Philomena
‘Philomena must / might be in her house.’ only epistemic
b. lán=lhkacw=ka áts’x-en ti=kwtámts-sw=a
already=2sg.subj=deon see-dir det=husband-2sg.poss=exis
‘You must / can / may see your husband now.’ only deontic
A simplified table representing the difference between the two types of

modal system is given in Table 1:
                 quantificational   conversational
                 force              background
Indo-European    lexical            context
St’át’imcets     context            lexical
Table 1 Indo-European vs. St’át’imcets modal systems
In this paper I extend the cross-linguistic comparison to the realm of

mood. I argue that St’át’imcets possesses a subjunctive mood, and show that
it induces a range of apparently disparate semantic effects, depending on the
construction in which it appears. One example of the use of the subjunctive
is given in (4): it weakens the force of a deontic modal proposition (in a sense
to be made precise below). Other uses include turning imperatives into polite
requests, and turning questions into statements of uncertainty (cf. van Eijk
1997 and Davis 2006).
(4) a. gúy’t=Ø=ka ti=sk’úk’wm’it=a

sleep=3indic=deon det=child=exis
‘The child should sleep.’
determiner, dir: directive transitivizer, ds: different subject, epis: epistemic, erg: ergative,
exis: assertion of existence, foc: focus, fut: future, impf: imperfective, inch: inchoative,
indic: indicative, infer: inferential evidential, irr: irrealis, loc: locative, mid: middle
intransitive, nom: nominalizer, obj: object, prt: particle, pass: passive, perc.evid: perceived
evidence, pl: plural, poss: possessive, prep: preposition, real: realis, red: redirective
applicative, rem.past: remote past, sbjn: subjunctive, sg: singular, sim: simultaneous, stat:
stative, temp.deic: temporal deictic, ynq: yes-no question. The symbol - marks an affix
boundary and = marks a clitic boundary.
b. guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’
I will show that the St’át’imcets subjunctive differs markedly from Indo-
European subjunctives, both in the environments in which it is licensed, and
in its semantic effects. I propose an analysis of the St’át’imcets subjunctive
which adopts insights put forward by Portner (1997, 2003). For Portner,
moods in various Indo-European languages place restrictions on the con-
versational background of a governing modal. I argue that the St’át’imcets
subjunctive mood can be analyzed within exactly this framework, with the
twist that in St’át’imcets, the restriction the subjunctive places on the gov-
erning modal obligatorily weakens the force of the proposition expressed.
This has an interesting consequence. While we can account for the
St’át’imcets subjunctive using the same theoretical tools as for Indo-European,
at a functional level the two languages are using their mood systems to
achieve quite different effects. In particular, St’át’imcets uses its mood sys-
tem to restrict modal force — precisely what this language does not restrict
via its lexical modals. At a functional level, then, we find the same kind of
cross-linguistic variation in the domain of mood as we do with modals. This
idea is illustrated in the simplified typology in Table 2:
                 lexically restrict   lexically restrict
                 quant. force         convers. background
Indo-European    modals               moods
St’át’imcets     moods                modals
Table 2 Modal and mood systems
These results suggest that while individual items in the realm of mood and
modality lexically encode different aspects of meaning, the systems as a
whole have very similar expressive power.
The structure of the paper: Section 2 introduces the St’át’imcets subjunc-
tive data. I first illustrate the nine different uses of the relevant agreement
paradigm, and then argue that this agreement paradigm is a subjunctive,
rather than an irrealis mood. Section 3 shows that the St’át’imcets sub-
junctive is not amenable to existing analyses of more familiar languages.
Section 4 reviews the basic framework adopted, that of Portner (1997), and
Section 5 provides initial arguments for adopting a Portner-style approach
for St’át’imcets. Section 6 presents the formal analysis, and Section 7 applies
the analysis to a range of uses of the subjunctive. Section 8 concludes and
raises some issues for future research.
2 St’át’imcets subjunctive data
St’át’imcets possesses a complex system of subject and object agreement.

There are different subject agreement paradigms for transitive vs. intransi-
tive predicates. For intransitive predicates, there are three distinct subject
paradigms, one of which is glossed as ‘subjunctive’ by van Eijk (1997) and
Davis (2006).2
       indicative                      subjunctive
       indicative      nominalized
1sg    tsút=kan        n=s=tsut        tsút=an
2sg    tsút=kacw       s=tsút=su       tsút=acw
3sg    tsut=Ø          s=tsút=s        tsút=as
1pl    tsút=kalh       s=tsút=kalh     tsút=at
2pl    tsút=kal’ap     s=tsút=lap      tsút=al’ap
3pl    tsút=wit        s=tsút=i        tsút=wit=as
Table 3 Subject agreement paradigms for the intransitive predicate tsut
‘to say’ (adapted from van Eijk 1997: 146)
With transitive predicates, the situation is similar, except that there are
four separate paradigms, one of which is subjunctive.3,4
2 The cognate forms are often called ‘conjunctive’ in other Salish languages, primarily in order
to disambiguate the abbreviations for ‘subject’ and ‘subjunctive’. See for example Kroeber
1999.
3 The traditional terms for the first two columns are ‘indicative’ and ‘nominalized’ respectively.
The nominalized endings are identical to nominal possessive endings, and are glossed as
‘poss’ in the data. The choice between these first two paradigms is syntactically governed: the
so-called ‘indicative’ surfaces in matrix clauses and relative clauses, while the nominalized
paradigm appears in subordinate clauses. Both these sets contrast semantically, in all
syntactic environments, with the subjunctive, hence my overall categorization of the first
two paradigms as ‘indicative’.
4 See Kroeber 1999 and Davis 2000 for justification of the analysis of subject inflection
In subsection 2.1 I illustrate the uses of the paradigms glossed as sub-

junctive, and in subsection 2.2 I argue that these paradigms more closely
approximate familiar subjunctives, rather than irrealis moods.
2.1 Uses of the St’át’imcets subjunctive
The mood I am glossing as ‘subjunctive’ has a wide range of uses, which

at first glance are not easily unifiable. I illustrate all of them here. First,
the subjunctive functions to turn a plain assertion into a wish (Davis 2006:
chapter 24).5
(5) a. nilh s=Lémya7 ti=kél7=a

foc nom=Lémya7 det=first=exis
‘Lémya7 is first.’
b. nílh=as s=Lémya7 ku=kéla7
foc=3sbjn nom=Lémya7 det=first
‘May Lémya7 be first.’
(6) a. ámh=as ku=scwétpcen-su!

good=3sbjn det=birthday=2sg.poss
‘May your birthday be good!’
b. ámh=as ku=s=wá7=su!
good=3sbjn det=nom=be=2sg.poss
‘Best wishes!’ [‘May your being be good.’] (Davis 2006: ch. 24)
This use of the subjunctive is very restricted (see van Eijk 1997: 147).
Minimal pairs cannot usually be constructed for ordinary assertions, as
shown in (7)–(9).
(7) a. kwis lhkúnsa

rain today
‘It’s raining today.’
b. *kwís=as lhkúnsa
rain=3sbjn today
intended: ‘May it rain today.’
assumed here. I do not provide the transitive paradigms, as subject markers vary based on
the person and number of the object and the table is excessively large. See van Eijk 1997 and
Davis 2006 for details.
5 The determiner alternation between (5a) and (5b) (ti=. . . =a vs. ku=) is predictable, but
irrelevant for current concerns. See Matthewson 1998, 1999 for discussion.
(8) a. áma ti=sq’ít=a

good det=day=exis
‘It is a good day.’
b. *ámh=as ti=sq’ít=a
good=3sbjn det=day=exis
intended: ‘May it be a good day.’
(9) a. guy’t ti=sk’úk’wm’it=a

sleep det=child=exis
‘The child is sleeping.’
b. *guy’t=ás ti=sk’úk’wm’it=a
sleep=3sbjn det=child=exis
intended: ‘I hope the child sleeps.’
In general, the subjunctive seems able to be added to a plain assertion only
in a cleft structure, as in (5), or in a conventionalized wish, as in (6). I return
to this issue below.
The more usual case of the subjunctive creating a wish-statement is when
it co-occurs with the deontic modal ka, as in (10)–(11).
(10) a. plan=ka=tí7=t’u7 wa7 máys-n-as

already=deon=demon=prt impf fix-dir-3erg
‘He should have fixed that already.’
b. plan=as=ká=tí7=t’u7 wa7 máys-n-as
already=3sbjn=deon=demon=prt impf fix-dir-3erg
‘I wish he had fixed that already.’
(11) a. gúy’t=ka ti=sk’úk’wm’it=a

sleep=deon det=child=exis
‘The child should sleep.’
b. gúy’t=ás=ka ti=sk’úk’wm’it=a
When used with the deontic modal ka, in addition to the ‘wish’ interpre-
tation shown in (10)–(11), the subjunctive can also render a ‘pretend to be ...’
interpretation.6
6 The data in (12) are from the Upper St’át’imcets dialect; in Lower St’át’imcets, (12a) is
corrected to (i), which has the subjunctive but lacks the deontic modal. This independent
(12) a. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em

owl=2sg.sbjn=deon fly deic and animal.noise-mid
‘Pretend to be an owl: fly around and hoot.’
(Davis 2006: chapter 24)
b. snu=hás=ka ku=skícza7
2sg.emph=3sbjn=deon det=mother
‘Pretend to be the mother.’
(Whitley, Davis, Matthewson & Frank (editors) no date)
The fourth construction which licenses the subjunctive is the imperative;

the subjunctive weakens an imperative to a polite request (Davis 2006:
chapter 24). In each of (13)–(15), the subjunctive imperative in (b) is construed
as ‘more polite’ than the plain imperative in (a). The subjunctive is particularly
common in negative requests, as in (15).
(13) a. lts7á=malh lh=kits-in’=ál’ap!

deic=adhort comp=put.down-dir=2pl.sbjn
‘Just put it over here!’
b. lts7á=has=malh lh=kits-in’=ál’ap
deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn
‘Could you put it down here?’/‘You may as well put it down over
here.’7 (adapted from Davis 2006: chapter 24)
(14) a. nás=malh áku7 pankúph=a

go=adhort deic Vancouver=exis
‘You’d better go to Vancouver.’
b. nás=acw=malh áku7 pankúph=a
go=2sg.sbjn=adhort deic Vancouver=exis
‘You could go to Vancouver.’
pronoun construction is argued by Thoma (2007) to be a concealed cleft. I return to this
issue below.
(i) nu=hás ku=kalúla7

2sg.emph=3sbjn det=owl
‘Pretend to be an owl.’
7 The third person subjunctive ending appears here because the structure is bi-clausal,
involving a third-person impersonal main predicate: ‘It is here that you could put it down.’
(15) a. cw7aoz kw=s=sek’w-en-ácw ta=nk’wanústen’=a

neg det=nom=break-dir-2sg.erg det=window=exis
‘Don’t break the window.’
b. cw7áoz=as kw=s=sek’w-en-ácw ta=nk’wanústen’=a
neg=3sbjn det=nom=break-dir-2sg.erg det=window=exis
‘Don’t break the window.’
Fifth, in combination with an evidential or a future modal, the subjunctive

helps to turn wh-questions into statements of uncertainty or wondering.
(16) a. kanem=lhkán=k’a
do.what=1sg.indic=infer
‘What happened to me?’
b. kanem=án=k’a
do.what=1sg.sbjn=infer
‘I don’t know what happened to me.’ / ‘I wonder what I’m doing.’8
(17) a. kanem=lhkácw=kelh múta7

do.what=2sg.indic=fut again
‘What are you going to be doing later?’
b. kanem=ácw=kelh múta7
do.what=2sg.sbjn=fut again
‘I wonder what you are going to do again.’ (van Eijk 1997: 215)
(18) a. nká7=kelh lh=cúz’=acw nas

where=fut comp=going.to=2sg.sbjn go
‘Where will you go?’
b. nká7=as=kelh lh=cúz’=acw nas
where=3sbjn=fut comp=going.to=2sg.sbjn go
‘Wherever will you go?’ / ‘I wonder where you are going to go
now.’ (adapted from Davis 2006: chapter 24)
The same effect arises with yes-no questions. In combination with the evi-
dential k’a or a future modal, the subjunctive also turns these into statements
of uncertainty which are often translated using ‘maybe’ or ‘I wonder’.
8 For expository reasons, k’a was glossed as ‘epistemic’ in (3a) above, but from now on will be
glossed as ‘inferential’. Matthewson et al. (2007) analyze k’a as an epistemic modal which
carries a presupposition that there is inferential evidence for the claim.
(19) a. lán=ha kwán-ens-as

already=ynq take-dir-3erg
ni=n-s-mets-cál=a
det.abs=1sg.poss-nom-write-act=exis
‘Has she already got my letter?’
b. lan=as=há=k’a kwán-ens-as
already=3sbjn=ynq=infer take-dir-3erg
ni=n-s-mets-cál=a
det.abs=1sg.poss-nom-write-act=exis
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got
my letter or not.’
(20) wa7=as=há=k’a tsicw
impf=3sbjn=ynq=infer get.there
i=n-sésq’wez’=a, cw7aoz kw=en
det.pl=1sg.poss-younger.sibling=exis neg det=1sg.poss
zwát-en
know-dir
‘Perhaps my younger siblings went along, I don’t know.’
(Matthewson 2005: 265)
In combination with a wh-indefinite and the evidential k’a, the subjunctive
creates free relatives with an ‘ignorance/free choice’ reading; see Davis 2006
for discussion.
(21) a. qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7,
leave=prt again go.downhill deic go then deic
nílh=k’a s=npzán-as
foc=infer nom=meet(dir)-3erg
k’a=lh=swát=as=k’a káti7 ku=npzán-as
infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg
‘So he set off downhill again, went down, and then he met who-
ever he met.’ (van Eijk & Williams 1981: 66, cited in Davis 2009)
b. o, púpen’=lhkan [ta=stam’=as=á=k’a]
oh find=1sg.indic [det=what=3sbjn=exis=infer]
‘Oh, I’ve found something or other.’
(Unpublished story by “Bill” Edwards, cited in Davis 2009)
When used in combination with the scalar particle t’u7, the subjunctive
creates a statement translated as ‘might as well’ or ‘may as well’.
(22) a. wá7=lhkan=t’u7 wa7 k’wzús-em

impf=1sg.indic=prt impf work-mid
‘I am just working.’
b. wá7=an=t’u7 wa7 k’wzús-em
impf=1sg.sbjn=prt impf work-mid
‘I might as well stay and work.’
(23) a. wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.indic=prt deic now det=evening
‘You are staying here for the night.’
b. wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.sbjn=prt deic now det=evening
‘You may as well stay here for the night.’
And finally, in combination with a wh-word and the scalar particle t’u7,
the subjunctive creates free relatives with a universal / indifference reading.
(24) a. wa7 táw-em ki=smán’c=a, ns7á7z’-em
impf sell-mid det.col=tobacco=exis trade-mid
ku=stám’=as=t’u7
det=what=3sbjn=prt
‘He was selling tobacco, trading it for whatever . . . ’
(van Eijk & Williams 1981: 74, cited in Davis 2009)
b. wa7 kwám=wit ku=káopi, ku=súkwa, ku=saplín,
impf take(mid)=3pl det=coffee det=sugar det=flour
[stám’=as=t’u7 cw7aoz
[what=3sbjn=prt neg
kw=s=ka-ríp-s-tum’-a
det=nom=circ-grow-caus-1pl.erg-circ
l=ti=tmícw-lhkalh=a]
on=det=land-1pl.poss=exis]
‘They got coffee, sugar, flour, whatever we couldn’t grow on our
land. . . ’ (Matthewson 2005: 105, cited in Davis 2009)
c. [stám’=as=t’u7 káti7 i=wá7
[what=3sbjn=prt deic det.pl=impf
ka-k’ac-s-twítas-a i=n-slalíl’tem=a]
circ-dry-caus-3pl.erg-circ det.pl=1sg.poss-parents=exis]
wa7 ts’áqw-an’-em lh=as sútik
impf eat-dir-1pl.erg comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’

(Matthewson 2005: 141, cited in Davis 2009)
The nine uses of the St’át’imcets subjunctive are summarized in Table 4:
environment                          indicative meaning              subjunctive meaning
plain assertion                      assertion                       wish
deontic modal                        deontic necessity/possibility   wish
deontic modal                        deontic necessity/possibility   ‘pretend’
imperative                           command                         polite request
wh-question + evidential/future      question                        uncertainty/wondering
yes-no question + evidential/future  question                        uncertainty/wondering
wh-word + evidential                 question                        ignorance free relative
scalar particle t’u7                 ‘just/still’                    ‘might as well’
wh-word + scalar particle t’u7       N/A                             indifference free relative
Table 4 Uses of the St’át’imcets subjunctive
These are all the cases where the subjunctive has a semantic effect; in
the next sub-section we will also see some cases where the subjunctive is
obligatory and semantically redundant. I will not aim to account for the entire
panoply of subjunctive effects in one paper. However, the analysis I offer
will explain the first seven uses, setting aside for future research only the
two uses which involve the particle t’u7. See Section 8 for some speculative
comments about the subjunctive in combination with t’u7.
2.2 This is a subjunctive mood
In this sub-section I justify the use of the term ‘subjunctive’ for the subject
agreements being investigated. The choice of terminology is intended to
reflect the fact that the St’át’imcets mood patterns with Indo-European sub-
junctives, rather than with Amerindian irrealis moods, in several respects.
However, we will see below that the St’át’imcets subjunctive also differs
semantically in important ways from Indo-European subjunctives.9

Palmer (2006) observes that there is a broad geographical typology, such
that European languages often encode an indicative/subjunctive distinc-
tion, while Amerindian and Papuan languages often encode a realis/irrealis
distinction. A typical irrealis-marking system is illustrated in (25).
(25) a. ho bu-busal-en age qo-in

pig sim-run.out-3sg+ds+real 3pl hit-3pl+rem.past
‘They killed the pig as it ran out.’ (Amele; Palmer 2006: 5)
b. ho bu-busal-eb age qo-qag-an
pig sim-run.out-3sg+ds+irr 3pl hit-3pl-fut
‘They will kill the pig as it runs out.’ (Amele; Palmer 2006: 5)
According to Palmer (2006: 145), the indicative/subjunctive distinction

and the realis/irrealis distinction are ‘basically the same’. The core function
of both a subjunctive and an irrealis is to encode ‘non-assertion’.10 However,
there are differences in distribution and in syntactic functions.
First, Palmer observes that subjunctive is not marked independently of
other inflectional categories such as person and number. Instead, there is
typically a full subjunctive paradigm. On the other hand, irrealis is often
marked by a single element. In this respect, the St’át’imcets mood patterns
like a subjunctive; see Table 3 above.
Second, in main clauses, irrealis marking is often used for questions,
futures and denials; this is not the case for main clause subjunctives. In this
respect also, the St’át’imcets mood patterns like a subjunctive. It is not used
to mark questions, futures or denials. (26)–(28) all have indicative marking.
9 This raises a terminological issue which arises in many areas of grammar. Should we apply
terms which were invented for European languages to similar — but not identical — categories
in other languages? For example, should we say ‘The perfect / definite determiner /
subjunctive in language X differs semantically from its English counterpart’, or should we
say ‘Language X lacks a perfect / definite determiner / subjunctive’, because it lacks an
element with the exact semantics of the English categories? I adopt the former approach
here, as I think it leads to productive cross-linguistic comparison, and because it suggests
that the traditional terms do not represent primitive sets of properties, but rather potentially
decomposable ones.
10 Palmer does not provide a definition of ‘non-assertion’. He observes that common reasons
why a proposition is not asserted are because the speaker doubts its veracity, because the
proposition is unrealized, or because it is presupposed (Palmer 2006: 3). See Section 3 below
for discussion.
(26) t’íq=Ø=ha kw=s=Josie?

arrive=3indic=ynq det=nom=Josie
‘Did Josie arrive?’
(27) t’íq=Ø=kelh kw=s=Josie

arrive=3indic=fut det=nom=Josie
‘Josie will arrive.’
(28) cw7aoz kw=s=t’iq=s s=Josie

neg det=nom=arrive=3poss nom=Josie
‘Josie didn’t arrive.’
Third, Palmer notes that subjunctive marking is obligatory and redundant
only in subordinate clauses, while irrealis marking is often obligatory and
redundant in main clauses. Here again, the St’át’imcets mood patterns like a
subjunctive. It is obligatory and redundant in only three cases. The first is
when it is embedded under the complementizer lh=. lh= is glossed by van Eijk
(1997) as ‘hypothetical’, and analyzed by Davis (2006) as a complementizer
which introduces subjunctive clauses, including if-clauses, as in (29a) and
(29b), temporal adjuncts (29b), locative adjuncts (29c), and complements to
the evidential k’a when this is used as a (focused) adverb (29d).
(29) a. lh=cw7áoz*(=as)=ka kw=s=gúy’t=su,

comp=neg*(=3sbjn)=irr det=nom=sleep=2sg.poss
lán=ka=tu7 wa7 xzum i=n’wt’ústen-sw=a
already=irr=then impf big det.pl=eye-2sg.poss=exis
‘If you hadn’t slept, your eyes would have been big already.’
(van Eijk & Williams 1981: 12)
b. xwáyt=wit=ka lh=wa7=wit*(=ás)=t’u7 qyax
many.people.die=3pl=irr comp=be=3pl*(=3sbjn)=prt drunk
múta7 tqálk’-em lh=w*(=as) qyáx=wit
and drive-mid comp=impf*(=3sbjn) drunk=3pl
‘They would die if they got drunk and drove when they were
drunk.’ (Matthewson 2005: 367)
c. lts7a lh=wa7*(=as) qwál’qwel’t
deic comp=impf*(=3sbjn) hurt
‘It is here that it is hurting.’
d. k’a lh=7án’was*(=as) sq’it,

maybe comp=two*(=3sbjn) day
ka-láx-s-as-a n-skícez7=a
circ-remember-caus-3erg-circ 1sg.poss-mother=exis
na=s-7ílacw-em-s=a
det=nom-soak-mid-3poss=exis
ta=n-qéqtsek=a
det=1sg.poss-older.brother=exis
‘Maybe two days later, my mother remembered the fish my
brother had been soaking.’
(Matthewson 2005: 152; cited in Davis 2006: chapter 23)11
The second case where the St’át’imcets subjunctive is obligatory and

redundant is when embedded under the complementizer i= ‘when’, as in (30).
i= has a similar distribution to lh=, but is restricted to past-time contexts.
See van Eijk 1997: 235-6 and Davis 2006: chapter 27 for discussion.
(30) a. i=kél7=at tsicw, áts’x-en-em

when.past=first=1pl.sbjn get.there see-dir-1pl.erg
i=cw7ít=a tsitcw
det=many=exis house
‘When we first got there, we saw lots of houses.’
b. wá7=lhkan lexláx-s i=kwís*(=as)
impf=1sg.indic remember-caus when.past=fall*(=3sbjn)
na=n-sésq’wez’=a, s=Harold Peter
det.abs=1sg.poss-younger.sibling=exis nom=Harold Peter
‘I remember when my little brother was born, Harold Peter.’
(Matthewson 2005: 354-5)
11 Incidentally, Davis (2006: chapter 23) observes that ‘two or more k’a lh= clauses strung
together form the closest equivalent in [St’át’imcets] of [English] “either...or”.’ An example is
given in (i).
(i) k’a lh=xw7utsin-qín’=as, k’a lh=tsilkst-qín’=as=kelh
maybe comp=four-animal=3sbjn maybe comp=five-animal=3sbjn=fut
‘It’ll either be a four point or a five point buck.’ (Davis 2006: chapter 23)
As Davis implies, St’át’imcets lacks any lexical item which renders logical disjunction, and
constructions like (i), although used to translate English ‘or’, are literally two ‘maybe’-clauses
strung together.
Finally, the subjunctive is obligatory when it appears in combination

with the perceived-evidence evidential =an’. =an’ is analyzed by Matthewson
et al. (2007) as an epistemic modal which is defined only if the speaker has
perceived indirect evidence for the prejacent proposition.
(31) a. *táyt=kacw=an’
hungry=2sg.indic=perc.evid
‘You must be hungry.’
b. táyt=acw=an’
hungry=2sg.sbjn=perc.evid
‘You must be hungry.’
(32) a. *nílh=Ø=an’ s=Sylvia ku=xílh-tal’i

foc=3indic=perc.evid nom=Sylvia det=do(caus)-top
‘Apparently it was Sylvia who did it.’
b. nílh=as=an’ s=Sylvia ku=xílh-tal’i
foc=3sbjn=perc.evid nom=Sylvia det=do(caus)-top
‘Apparently it was Sylvia who did it.’
(Matthewson et al. 2007: 208)
The perceived-evidence evidential is the only environment in the language

where the subjunctive is obligatory in a matrix clause. I assume that the
subjunctive lacks semantic import here, as an otherwise very similar evi-
dential lákw7a does not allow the subjunctive in cases parallel to (31)–(32)
(Matthewson 2010, to appear).
The conclusion is that St’át’imcets, in spite of being an Amerindian lan-
guage, has a mood which patterns, at least morpho-syntactically, like a
subjunctive rather than an irrealis. This fits with how van Eijk (1997) and
Davis (2000, 2006) gloss the relevant forms. However, we will see in the next
section that the St’át’imcets subjunctive differs semantically in interesting
ways from European subjunctives.
3 Why previous analyses do not work for St’át’imcets
The vast majority of formal research on the subjunctive deals with Indo-
European. In languages such as the Romance languages, the subjunctive
mood is used for wishes, fears, speculations, doubts, obligations, reports,
unrealized events, or presupposed propositions. Some examples are provided
in (33)–(34).
(33) a. creo que aprende

I.believe that learn+3sg+pres+indic
‘I believe that he is learning.’ (Spanish; Palmer 2006: 5)
b. dudo que aprenda
I.doubt that learn+3sg+pres+sbjn
‘I doubt that he’s learning.’ (Spanish; Palmer 2006: 5)
(34) potessi venire anch’ io

can+1sg+pres+sbjn come also I
‘If only I could come too.’ (Italian; Palmer 2006: 109)
In this section I briefly discuss some of the main approaches to the

subjunctive. I cannot do justice to the full array of proposals in the literature;
the goal is to provide enough background to establish that the St’át’imcets
subjunctive is not amenable to a range of existing approaches.
One pervasive line of thought is that subjunctive encodes a general se-
mantic contribution of ‘non-assertion’ (Bolinger 1968, Terrell & Hooper 1974,
Hooper 1975, Klein 1975, Farkas 1992, Lunn 1995, Palmer 2006, Haverkate
2002, Panzeri 2003, among others). One recent formal proposal in this line
is that of Farkas (2003). Farkas argues that there is a correlation between
indicative mood and complements which have assertive context change potential
relative to the embedded environment. Assertive context change for a
matrix clause is defined as in (35); the context set of worlds $W_c$ is narrowed.
(35) Assertive context change
c + φ is assertive iff $W_{c'} = W_c \cap p$, where $c'$ is the output context.
(Farkas 2003: 5)
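The definition in (35) can be illustrated with a minimal sketch, assuming a toy model in which a world is simply the set of atomic facts true in it and a proposition is a set of worlds. The world names, the fact labels, and the function name assertive_update are hypothetical and serve only to make the intersection in (35) concrete; nothing here is taken from Farkas (2003) beyond the definition itself.

```python
# Toy model (an assumption for illustration): a world is a frozenset of atomic facts.
def assertive_update(context_set, proposition):
    """Return the output context set: the input context set intersected with p."""
    return context_set & proposition

# Three made-up worlds, distinguished by whether it rains and whether John came.
w1 = frozenset({"rain", "john_came"})
w2 = frozenset({"john_came"})
w3 = frozenset()

W_c = {w1, w2, w3}                          # input context set
p = {w for w in W_c if "john_came" in w}    # the proposition 'John came'

W_c_prime = assertive_update(W_c, p)        # output context set
```

On this rendering, an assertive update can only shrink the context set: the worlds in which the asserted proposition is false are removed and no world is added.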
Farkas provides an analysis of assertion in embedded contexts which

predicts that positive epistemic predicates like believe or know take indicative
complements, as these complements are asserted relative to the matrix
subject’s epistemic state.12
12 Predicates like believe take subjunctive complements in Italian; see Giorgi & Pianesi 1997,
among many others, for discussion.
Predicates of assertion (‘say’, ‘assert’) and of fiction (‘dream’, ‘imagine’)
similarly introduce complements which are assertively added to the embedded
speech context, and also take indicative complements. On the other
hand, complements to desideratives (‘want’, ‘wish’, ‘desire’) and directives
(‘command’, ‘direct’, ‘request’) are not assertive. Rather than eliminating
worlds in the context set where the complement is false, these predicates
eliminate worlds in the context set which are low on an evaluative ranking.13
Thus, these predicates take the subjunctive:
(36) Maria vrea să-i răspundă

Maria wants subj-cl answer.sbjn
‘Maria wants to answer him.’ (Romanian; Farkas 2003: 2)
Giannakidou (1997, 1998, 2009) offers an alternative characterization

of the distribution of the subjunctive, according to which it appears in
nonveridical contexts, while indicative appears in veridical contexts. The
relevant definition is given in (37):
(37) A propositional operator F is veridical iff from the truth of F p we

can infer that p is true relative to some individual x (i.e., in some
individual x’s epistemic model) . . . If inference to the truth of p under
F is not possible, F is nonveridical. (Giannakidou 2009: 1889)
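The contrast that (37) draws can be made concrete with a small brute-force check. The following is only a sketch under strong simplifying assumptions: there is a single individual (the attitude holder), that individual's epistemic model is a non-empty set of worlds, 'believe' quantifies over that model, and 'want' quantifies over a separate set of most-preferred worlds. The names WORLDS, believe, want, and is_veridical are hypothetical and are not part of Giannakidou's proposal.

```python
from itertools import chain, combinations

WORLDS = frozenset({"w1", "w2", "w3"})

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1))]

def true_in_model(p, model):
    """p holds throughout a non-empty set of worlds."""
    return bool(model) and model <= p

def believe(p, scenario):
    # 'x believes p': p holds in all of x's doxastic alternatives.
    return true_in_model(p, scenario["dox"])

def want(p, scenario):
    # 'x wants p' (toy rendering): p holds in all of x's most-preferred worlds.
    return true_in_model(p, scenario["best"])

def is_veridical(operator, scenarios):
    """True iff, whenever F p holds, p is true in the holder's epistemic model."""
    return all(true_in_model(p, s["dox"])
               for s in scenarios
               for p in powerset(WORLDS)
               if operator(p, s))

# Every pairing of a non-empty epistemic model with a non-empty preference set.
scenarios = [{"dox": dox, "best": best}
             for dox in powerset(WORLDS) if dox
             for best in powerset(WORLDS) if best]

believe_is_veridical = is_veridical(believe, scenarios)
want_is_veridical = is_veridical(want, scenarios)
```

In this toy setting the belief operator comes out veridical and the desire operator nonveridical, which mirrors the indicative/subjunctive split described in the text for epistemics versus volitionals.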
According to this analysis, the division between indicative-taking and

subjunctive-taking predicates relies on whether at least one epistemic agent
is committed to the truth of the embedded proposition. Giannakidou’s
approach predicts a similar division between indicative- and subjunctive-
taking predicates to Farkas’s. In Modern Greek, the indicative is found
in complements to predicates of assertion or fiction, epistemics, factives
and semi-factives. The subjunctive is found in complements to volitionals,
directives, modals, permissives, negatives, and verbs of fear (Giannakidou
2009: 9).14
13 The complements of desideratives are also not ‘decided’ relative to their context set, which
is what is actually crucial here for Farkas (2003). Farkas proposes an Optimality Theory
account involving the two constraints in (i):
(i) *SUBJ/+Decided *IND/-Assert
Different rankings of these two constraints give rise to different mood choices in Romanian
vs. French for emotive factive predicates like ‘be sorry/happy’, ‘regret’. Emotive factives are
+Decided but -Assertive, and take the indicative in Romanian and the subjunctive in French.
14 Giannakidou (2009) proposes that the Modern Greek subjunctive complementizer na
contributes temporal semantics (introducing a ‘now’ variable). The generalization is still that
subjunctive appears in non-veridical contexts; see Giannakidou 2009 for details.
An approach which aims to derive mood selection directly from the
semantics of subordinating predicates is that of Villalta (2009). Villalta argues
that subjunctive-selecting predicates are those whose embedded propositions
are compared to contextual alternatives on a scale encoded by the predicate.
The contribution of the subjunctive is to evaluate the contextual alternatives.
Quer (1998, 2001), looking mainly at Catalan and Spanish, argues that the
subjunctive signals a shift in the model of the evaluation of the truth of the
proposition. For unembedded assertions, the anchor is the Speaker and the
model is the epistemic model of the Speaker. Operators which introduce sub-
junctive introduce buletic models, or other models which create comparative
relations among worlds. This predicts we will find subjunctive in purpose
clauses, and predicts indicative/subjunctive alternations in restrictive rel-
ative clauses, concessives, and free relatives. Quer (2009) also discusses
indicative/subjunctive alternations in conditionals, claiming that indicative
appears in protases that are ‘realistic in the sense that they quantify over
worlds which are close enough to the actual one’ (2009: 1780). Subjunctive is
used when the worlds are further away from the actual one or even disjoint
from it.
An approach to mood which draws on notions from noun phrase se-
mantics is offered by Baker & Travis (1997). Baker and Travis argue that in
Mohawk, mood marks a division between ‘verbal specificity’ (‘factive’ mood)
and Kamp/Heim-style indefiniteness (two variants of non-factive mood, pre-
viously called the ‘future’ and the ‘optative’). Indefinite/non-factive mood
appears in future contexts, in past habituals, in negative clauses, under the
verbs ‘promise’ and ‘want’, and in free relatives with a non-specific reading.
What links all these indefinite-mood environments, according to Baker and
Travis, is the same feature that characterizes indefinite noun phrases in the
Kamp/Heim system: a free variable (in the Mohawk case, an event variable)
which undergoes existential closure in the scope of various operators.
This ends our brief tour through some major formal approaches to the
subjunctive.15 The reader is referred to Portner (2003) for further overview
and discussion. In the next sub-section I show that the St’át’imcets subjunc-
tive does not behave like the Indo-European or Mohawk subjunctives, and
that a new approach is required.
15 I defer discussion of Portner’s (1997) analysis to Section 5, since I will be adapting Portner’s
approach for St’át’imcets.
3.1 The St’át’imcets subjunctive is not amenable to existing approaches
The St’át’imcets subjunctive differs from familiar subjunctives in both its

distribution and semantic effects. Although there are some initial similarities,
such as the fact that both St’át’imcets and Indo-European subjunctives can be
used to express wishes and hopes, St’át’imcets mood displays no sensitivity to
the choice of matrix predicate. Thus, unlike in Romance or Greek, predicates
of assertion, belief and fiction are not differentiated from desideratives or
directives. All attitude verbs in St’át’imcets take the indicative, as illustrated
for a representative range in (38).16,17
(38) a. tsut k=Laura kw=s=t’iq=Ø k=John

say det=Laura det=nom=arrive=3indic det=John
‘Laura said that John came.’
b. tsut-ánwas k=Laura kw=s=t’iq=Ø k=John
say-inside det=Laura det=nom=arrive=3indic det=John
‘Laura thought that John came.’
c. zwát-en-as k=Laura kw=s=t’iq=Ø k=John
know-dir-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura knew that John came.’
16 Interestingly, the same is not true of the related language Skwxwú7mesh (Squamish). In
Skwxwú7mesh, the subjunctive (glossed as ‘conjunctive’; see fn. 2) is obligatory under ‘tell
someone to do something’ (as in (i)), but is optional under ‘I think’, depending on whether
the speaker knows that the event did not take place (ii-iii) (all data from Peter Jacobs, p.c.).
(i) chen tsu-n-Ø-Ø mi as uys

I tell-dir-dat-3obj come 3conj come.inside
‘I told him to come inside.’
(ii) chen ta7aw’n kwi s-Ø-s mi uys

I think det nom-real-3poss come come.inside
‘I think he came inside.’
(iii) chen ta7aw’n k’-as mi uys

I think irr-3conj come come.inside
‘I thought he came inside (but then I found out that he’s still outside playing).’
Jacobs (1992) analyzes the mood distinction in Skwxwú7mesh as encoding speaker certainty,
which suggests that it differs from the St’át’imcets mood system.
17 The expected subject inflection in the embedded clauses in (38) would actually be possessive
=s; see van Eijk 1997 and Davis 2006. However, many modern speakers prefer to omit the
possessive ending and to use matrix indicative =Ø in these contexts. This does not affect
the point at hand, as the variation is between two forms of indicative marking.
d. kw7íkwl’acw k=Laura kw=s=t’iq=Ø k=John

dream det=Laura det=nom=arrive=3indic det=John
‘Laura dreamt that John came.’
e. xát’-min’-as k=Laura kw=s=t’iq=Ø k=John
want-red-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura wanted John to come.’
f. tsa7cw k=Laura kw=s=t’iq=Ø k=John
glad det=Laura det=nom=arrive=3indic det=John
‘Laura was happy that John came.’
g. tsún-as k=Laura k=John kw=s=ts7as=Ø
say(dir)-3erg det=Laura det=John det=nom=come=3indic
‘Laura told John to come.’18
The St’át’imcets subjunctive is also not used under negated verbs of

belief or report, as it is in many European languages (cf. Palmer 2006: 116).
Compare Spanish (39a) with St’át’imcets (39b) and (39c).
(39) a. no creo que aprenda

not I.think that learn+3sg+pres+sbjn
‘I don’t think that he is learning.’ (Spanish; Palmer 2006: 117)
b. cw7aoz kw=en=tsut-ánwas kw=s=zwátet-cal=s
neg det=1sg.poss=say-inside det=nom=know-act=3poss
‘I don’t think that he is learning.’
c. cw7aoz kw=s=tsut=s kw=s=Aggie
neg det=nom=say=3poss det=nom=Aggie
kw=s=t’cum=s i=gáp=as
det=nom=win=3poss when.past=evening=3sbjn
‘Aggie didn’t say she won last night.’
18 The predicate in (38g) differs from that in (38a)–(38f) because the ‘ordering’ environment in
(38g) requires an unergative embedded verb.
Nor does the St’át’imcets subjunctive give rise to interpretive differences
inside relative clauses. In some Indo-European languages, an
indicative/subjunctive contrast in restrictive relatives gives rise to a distinction
which has variously been analyzed as referential/attributive,
specific/non-specific, or wide-scope/narrow-scope (see Rivero 1975, Farkas 1992,
Giannakidou 1997, Beghelli 1998, Quer 2001, among many others). This is illustrated in
(40) for Catalan. Quer’s analysis of these examples involves a shifting of the
model in which the descriptive condition in the relative clause is interpreted;
the effect is one of apparent ‘wide-scope’ for the descriptive condition in the
indicative (40a), as opposed to in the subjunctive (40b).
(40) a. necessiten un alcalde [que fa grans

need.3pl a mayor that make.indic.prs.3sg big
inversions]
investments
‘They need a mayor that makes big investments.’
(Catalan; Quer 2001: 90)
b. necessiten un alcalde [que faci grans
need.3pl a mayor that make.sbjn.prs.3sg big
inversions]
investments
‘They need a mayor that makes big investments.’
(Catalan; Quer 2001: 90)
In St’át’imcets, nominal restrictive relatives uniformly take indicative
marking, as shown in (41). The distinction which in Catalan is encoded
by mood is achieved in St’át’imcets by means of determiner choice (see
Matthewson 1998, 1999 for analysis).
(41) a. wa7 xat’-min’-ítas ti=kúkwpi7=a wa7

impf want-red-3pl.erg det=chief=exis impf
ka-nuk’wa7-s-tanemwít-a k=wa=s mays
circ-help-caus-3pl.pass-circ det=impf=3poss fix
ku=tsetsítcw
det=houses
‘They need a (particular) chief who can help them build houses.’
[wide-scope indefinite]
b. wa7 xat’-min’-ítas ku=kúkwpi7 wa7
impf want-red-3pl.erg det=chief impf
ka-nuk’wa7-s-tanemwít-a k=wa=s mays
circ-help-caus-3pl.pass-circ det=impf=3poss fix
ku=tsetsítcw
det=houses
‘They need a(ny) chief who can help them build houses.’
[narrow-scope indefinite]
The mood effects seen in conditionals in some Indo-European languages

are also absent in St’át’imcets. The antecedents of both notionally indicative
and subjunctive conditionals are obligatorily marked with the subjunctive,
as shown in (42), a paradigm borrowed from Quer 2009: 1780. Although
there are ways to distinguish the different types of conditionals, they do not
involve an indicative-subjunctive mood alternation.
(42) a. Context: I’m looking for John. You say:

lh=7áts’x-en=an, nílh=t’u7 s=qwál’-en-tsin
comp=see-dir=1sg.sbjn foc=prt nom=tell-dir-2sg.obj
‘If I see him, I’ll tell you.’

b. Context: I’m looking for John, and I suspect you know where he
is but you haven’t been telling me. You say:
lh=7ats’x-en=án=ka, sqwal’-en-tsín=lhkan=kelh
comp=see-dir=1sg.sbjn=irr tell-dir-2sg.obj=1sg.indic=fut
‘If I saw him, I would tell you.’

c. Context: I was looking for John, but he left town before I could
find him. You say:
lh=7ats’x-en=án=ka=tu7
comp=see-dir=1sg.sbjn=irr=then
qwal’-en-tsín=lhkan=ka
tell-dir-2sg.obj=1sg.indic=irr
‘If I had seen him, I would have told you.’
The St’át’imcets subjunctive is also not like the Mohawk one. Unlike in
Mohawk, St’át’imcets futures take the indicative, as shown in (43); so do past
habituals, as shown in (44), and plain negatives, as in (45).
(43) a. ats’x-en-tsí=lhkan=kelh lh=nátcw=as

see-dir-2sg.obj=1sg.indic=fut comp=one.day.away=3sbjn
‘I’ll see you tomorrow.’
b. *ats’x-en-tsín=an=kelh lh=nátcw=as
see-dir-2sg.obj=1sg.sbjn=fut comp=one.day.away=3sbjn
‘I’ll see you tomorrow.’
(44) a. wa7=lhkalh=wí7=tu7 n-záw’-em ku=qú7

impf=1pl.indic=emph=then loc-get.water-mid det=water
lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a
from=det=water(pl)=exis and from=det=creek=exis
‘We used to fetch water from the spring and the creek.’
b. *wa7=at=wí7=tu7 n-záw’-em ku=qú7
impf=1pl.sbjn=emph=then loc-get.water-mid det=water
lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a
from=det=water(pl)=exis and from=det=creek=exis
‘We used to fetch water from the spring and the creek.’
(45) a. áy=t’u7 kw=en=gúy’t ku=pála7 sgap

neg=prt det=1sg.poss=sleep det=one evening
‘I didn’t sleep one night.’ (Matthewson 2005: 267)
b. *áy=t’u7 kw=s=gúy’t=an ku=pála7 sgap
neg=prt det=nom=sleep=1sg.sbjn det=one evening
‘I didn’t sleep one night.’
Finally, there are the cases where the St’át’imcets subjunctive does ap-
pear, with a predictable meaning difference, which are not attested in other
languages. These include the use of the St’át’imcets subjunctive to weaken
an imperative to a polite request, or to help turn a question into a statement
of uncertainty (see examples in (13)–(15) and (16)–(20) above).
I will argue below that in spite of these major empirical differences
between the St’át’imcets subjunctive and that of familiar languages, the basic
framework for mood semantics advanced by Portner (1997) can be adapted
to capture all the St’át’imcets facts. This will support Portner’s proposal
that moods are dependent on modals and place restrictions on the modal
environments in which they appear.
4 Basic framework: Portner 1997
Portner’s (1997) leading idea is that moods place presuppositions on the
modal environment in which they appear. More precisely, moods typically
restrict properties of the accessibility relation associated with a governing
modal operator (see also Portner 2003: 64). The modal operator may be
provided by a higher attitude verb or modal; it may also, in unembedded
situations, be provided by context.
For illustration, let us first see how Portner analyzes English ‘mood-
indicating may’. In each of the examples in (46), the may is not the ordinary
modal may; it is not asserting possibility. (46b), for example, does not mean
‘it is possible that it is possible that Sue wins the race.’
(46) a. Jack wishes that you may be happy.
b. It is possible that Sue may win the race.
c. May you have a pleasant journey! (Portner 1997: 190)
Portner argues that mood-indicating may presupposes that p is doxastically
possible (possible according to someone’s beliefs). For example, (46a)
presupposes that Jack believes it is possible for you to be happy. He provides
the analysis in (47).
(47) For any reference situation $r$, modal force $F$, and modal context $R$,
$\llbracket may_{dep}(\varphi) \rrbracket^{r,F,R}$ is only defined if $\varphi$ is possible with respect to
$Dox_{\alpha}(r)$, where $\alpha$ is the denotation of the matrix subject.
When defined, $\llbracket may_{dep}(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$ (Portner 1997: 201)
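The definedness condition in (47) can be pictured as a partial identity function. The following is a minimal sketch under stated assumptions, not Portner’s own formalization: it assumes a finite set of worlds, models a proposition as the set of worlds in which it is true, stands in for $Dox_{\alpha}(r)$ with a hypothetical set `dox_worlds`, and represents undefinedness as `None`. All world names are invented for illustration.

```python
from typing import FrozenSet, Optional

World = str
Proposition = FrozenSet[World]  # assumption: a proposition is the set of worlds where it is true


def may_dep(phi: Proposition, dox_worlds: FrozenSet[World]) -> Optional[Proposition]:
    """Toy version of (47): return phi unchanged if phi is doxastically possible,
    otherwise None (standing in for 'undefined')."""
    # phi is possible with respect to the belief worlds iff it is true in at least one of them
    if phi & dox_worlds:
        return phi
    return None


# Hypothetical model for (46a): Jack's belief worlds and the worlds where you are happy.
jack_beliefs = frozenset({"w1", "w2"})
you_are_happy = frozenset({"w2", "w3"})
ruled_out_by_jack = frozenset({"w9"})

# Presupposition met: the mood marker passes the proposition through unchanged.
print(may_dep(you_are_happy, jack_beliefs) == you_are_happy)
# Presupposition not met: no value is returned.
print(may_dep(ruled_out_by_jack, jack_beliefs))
```

The point of the sketch is only that the mood marker contributes no truth-conditional content of its own: when it is defined, its output is its input.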
Portner further argues that there are actually two mood-indicating may’s,
with slightly different properties. Mood-indicating may under wish, pray,
etc. (as in (46a)) or in unembedded clauses (as in (46c)) has an extra require-
ment: it presupposes that the accessibility relation R is buletic (deals with
somebody’s wishes or desires).
The discussion of mood-indicating may illustrates an important aspect
of Portner’s analysis, namely that moods place presuppositions on the modal
accessibility relation (a type of conversational background). With English
mood-indicating may, there is a doxastic and sometimes a buletic restriction.
For the English mandative subjunctive, which appears in imperatives as well
as in embedded contexts as in (48), R must be deontic, as shown in (49).
(48) Mary demands that you join us downstairs at 3pm. (Portner 1997: 202)
(49) For any reference situation $r$, modal force $F$, and modal context $R$,
$\llbracket m\text{-}subj(\varphi) \rrbracket^{r,F,R}$ is only defined if $R$ is a deontic accessibility relation.
When defined, $\llbracket m\text{-}subj(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$ (Portner 1997: 202)
For Italian moods, Portner claims that R is restricted to being (non-)factive.¹⁹
The idea that moods restrict modal conversational backgrounds is common
to several other modal-based analyses of mood (e.g., Farkas 1992 and Giorgi &
Pianesi 1997²⁰), and is also found in James 1986. What James calls ‘manners
of representation’ are root vs. epistemic conversational backgrounds:
The ambiguity of the modal auxiliaries . . . supports the hypothesis
that there are two separate manners of representation.
Moods . . . signify manners of representation. They are not am-
biguous, however; they signify one modality or the other (James
1986: 15).
In the analysis to follow, I will adopt Portner’s idea that moods place
restrictions on a governing modal operator. I will argue that the empirical
differences between the St’át’imcets subjunctive and Indo-European sub-
junctives derive from the fact that the former restricts the conversational
background of the modal operator in such a way that the modal force is
weakened.
5 Adapting Portner’s approach for the St’át’imcets subjunctive
I deal here only with the constructions where the subjunctive has a semantic
effect; I will not address the cases of obligatory subjunctive agreement which
were presented in subsection 2.2.²¹ My analysis will account for all meaningful
uses of the St’át’imcets subjunctive except the two uses which contain the
particle t’u7. See Section 8 for some discussion of the t’u7-constructions.
19 Interestingly, the Italian indicative imposes a modal force restriction as well as a conver-
sational background restriction; it is only used with a force of necessity (Portner 1997:
197).
20 According to Giorgi and Pianesi, the subjunctive indicates that the ordering source is non-
empty; this is a restriction on a conversational background.
21 The analysis presented below is actually compatible with the obligatory presence of the
subjunctive in if -clauses introduced by lh=, and may even help to explain why lh= obligatorily
selects the subjunctive when it means ‘if’, but selects indicative when it means ‘before’.
Thanks to Henry Davis for discussion of this point, and see Davis 2006: chapter 26. (See also
van Eijk 1997: 217, although van Eijk analyzes the subjunctive-inducing lh= as distinct from
(e)lh= ‘before’.) As for the other obligatory cases of subjunctive, these may be grammaticized,
semantically bleached relics of original meaningful uses, aided by the fact that subjunctive
marking is intertwined with person agreement.
5.1 The St’át’imcets subjunctive presupposes rather than asserts a modal semantics
The first thing to establish is that like Portner’s moods, the St’át’imcets
subjunctive does not itself assert a modal semantics, but is dependent on
a governing modal operator. One piece of evidence for this is that the
St’át’imcets subjunctive must co-occur with an overt modal in almost all its
uses. Of the seven uses of the subjunctive being analyzed here, five of them
have an overt modal (the deontics, ‘pretend’, wh-questions, yes-no questions,
ignorance free relatives), one of them is plausibly analyzed as containing a
covert modal (imperatives), and only one is non-modal (plain assertions). As
noted above, the addition of the subjunctive to plain assertions is extremely
restricted and at least semi-conventionalized. If the subjunctive were itself
independently modal, it would be difficult to explain the minimal contrasts
in (50)–(51).²²
(50) a. *gúy’t=as ti=sk’úk’wm’it=a
sleep=3sbjn det=child=exis
Attempted: ‘I hope the child sleeps.’
b. gúy’t=as=ka ti=sk’úk’wm’it=a
(51) a. *skalúl7=acw: saq’w knáti7 múta7 em7ímnem
owl=2sg.sbjn fly deic and make.animal.noise
b. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímnem
owl=2sg.sbjn=deon fly deic and make.animal.noise
Furthermore, just like with English mood-indicating may, the interpretation
of St’át’imcets subjunctive clauses indicates that the mood does not
itself contribute modal semantics. For example, (50b) does not mean ‘It
should be the case that the child should sleep’.
22 As noted above, Portner’s analysis does allow for unembedded uses of non-indicative moods,
with the modal accessibility relation being provided by context. So there is no problem with
the cases where the St’át’imcets subjunctive can appear without a c-commanding modal
(as in (5)–(6)). Of course, we would eventually like to explain when these unembedded
subjunctives can and cannot appear. Portner (1997: 201) notes for mood-indicating may and
the mandative subjunctive that ‘Neither of these have a completely predictable distribution,
in that neither occurs in every context in which a purely semantic account would predict
that it could . . . it must be admitted that lexical and syntactic idiosyncracies come into play.’
The St’át’imcets subjunctive also patterns morphosyntactically like a
mood rather than like real modals in the language. As shown above, the
subjunctive is obligatorily selected by some complementizers, unlike modals.
The subjunctive is also fused with subject marking into a full paradigm, unlike
the modals, which are independent second-position clitics.²³ I therefore
conclude that the St’át’imcets subjunctive does not itself introduce a modal
operator, but requires one in its environment.
5.2 The St’át’imcets subjunctive does not presuppose a particular conversational background
The St’át’imcets subjunctive differs from most Indo-European moods in that
it cannot be analyzed as being restricted to a certain type of conversational
background. This is illustrated by the fact that it allows deontic, buletic or
epistemic uses. Deontic conversational backgrounds arise with imperatives,
as in (52) or (14b), repeated here in (53):
(52) ets7á=has=(malh) lh=xílh-ts=al’ap
deic=3sbjn=(adhort) comp=do-caus=2pl.sbjn
‘Could you do it like this, you folks?’
(53) nás=acw=malh áku7 pankúph=a
go=2sg.sbjn=adhort deic Vancouver=exis
‘You could go to Vancouver.’
Buletic conversational backgrounds arise with the modal ka:
(54) plan=as=ká=ti7=t’u7 wa7 máys-n-as
already=3sbjn=deon=demon=prt impf fix-dir-3erg
‘I wish he had fixed that already.’
(55) guy’t=ás=ka ti=sk’úk’wm’it=a
23 Or in one case, a circumfix on the verb; see Davis et al. 2009.
And epistemic conversational backgrounds arise with questions.
(56) nká7=as=kelh lh=cúz’=acw nas
‘Wherever will you go?’ / ‘I wonder where you are going to go now.’
(adapted from Davis 2006: chapter 24)
(57) lan=as=há=k’a kwán-ens-as
already=3sbjn=ynq=infer take-dir-3erg
ni=n-s-mets-cál=a
det.abs=1sg.poss-nom=write-act=exis
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got my
letter or not.’
These data suggest that the St’át’imcets subjunctive is not analyzable in
the same way as the European moods discussed by Portner (1997), which
hardwire a restriction to a particular type of conversational background.
5.3 Instead, the St’át’imcets subjunctive functions to weaken the modal force
The core idea of my proposal is that the St’át’imcets subjunctive restricts its
governing modal only in such a way as to weaken the force of the proposi-
tion expressed. The intuition that the St’át’imcets subjunctive weakens the
proposition it adds to was already expressed by Davis (2006: chapter 24):
The best way to characterize this meaning difference is in terms
of the ‘force’ of a sentence. With ordinary indicative subjects,
a sentence expresses a straightforward assertion, question or
command; but with subjunctive subjects, the effect is to weaken
the force of the sentence, so that an assertion becomes a wish,
a question becomes a conjecture, and a command becomes a
request.
The important question is what exactly is meant by ‘weakening’ in this
context, and how to derive the various effects of the subjunctive in a unified
way. I will claim that the St’át’imcets subjunctive restricts the conversational
background of a governing modal in such a way that the modal imparts a
force no stronger than weak necessity. Since there are no modals which
lexically encode quantificational force in St’át’imcets, this will mean that the
subjunctive must appear in the scope of a variable-force modal, and will
restrict it to a weakened interpretation.
6 Analysis
The idea to be pursued is that the St’át’imcets subjunctive restricts the
domain of quantification of a c-commanding modal, so that the interpretation
which obtains is weaker than pure necessity.²⁴ Rullmann et al. (2008) argue
that St’át’imcets possesses no modals which are lexically restricted for a
pure necessity reading (see also Matthewson et al. 2007 and Davis et al.
2009). Instead, all St’át’imcets modals seem to allow both weak and strong
interpretations (see (3) above, and see the references cited for many more
examples). So, what we need to say is that the subjunctive forces an already
potentially weak c-commanding modal to have a weak reading. In order to
see how this will work, I first very briefly review the basics of a Kratzerian
analysis of modals, and then outline how modals in St’át’imcets are analyzed.
We will then add the subjunctive.
Modals in a standard analysis introduce quantifiers over possible worlds.
The set of worlds quantified over is narrowed down by two conversational
backgrounds. First, it is narrowed down by the modal base, and then it is
ordered and further narrowed down by the ordering source. The modal base
and the ordering source are both usually provided by context in English,
although there are systematic contributions of tense and aspect to the con-
versational background (see e.g., Condoravdi 2002 for discussion). A simple
example is given in (58).
(58) Chris must do his homework.
Modal base (circumstantial): The set of worlds in which the relevant
facts are the same as in the actual world (e.g., we ignore worlds where
Chris is not in school).
Ordering source (normative): Orders worlds in the modal base so
that the best worlds are those which come closest to the ideal repre-
sented by the school’s homework regulations.
Universal quantification: In all the best worlds, Chris does his home-
work.
24 I would like to thank David Beaver and three anonymous reviewers for helping me clarify
aspects of the analysis and its presentation.
Rullmann et al. (2008) argue that there are two differences between English
universal modals like must and St’át’imcets modals. First, the St’át’imcets
modals place presuppositions on the conversational backgrounds. Second,
the set of best worlds is further narrowed down by a choice function which
picks out a potentially proper subset of the best worlds to be quantified
over. This can lead to a weaker reading, depending on context. The idea is
illustrated informally in (59).²⁵
(59) gúy’t=ka ti=sk’úk’wm’it=a
sleep=deon det=child=exis
‘The child must/should/can sleep.’
Modal base (presupposed to be circumstantial): Worlds in which the
relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds
are those in which my desire for an early night is fulfilled.
Choice function: Picks out a potentially proper subset of the best
worlds.
Universal quantification: In all worlds in the subset of the best worlds
picked out by the choice function, the child sleeps.
Since the quantification is over a potentially proper subset of the best
worlds, sentences like (59) can be interpreted with any strength ranging
from a pure possibility (‘The child can/may sleep’) to a strong necessity
(‘The child must sleep’). The apparent variable quantificational force of
St’át’imcets modals is thus derived not by ambiguity in the quantifier itself,
but by restricting the size of the set of worlds quantified over by the universal
quantifier. The larger the subset of the best worlds selected by the choice
function, the stronger the proposition expressed. As a limiting case, the
choice function may be the identity function. This results in a reading that is
equivalent to the standard analysis of strong modals like must in English.
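The claim that a larger selected subset yields a stronger proposition can be checked mechanically on a toy model. The sketch below is only an illustration under assumed names: the worlds `w1` to `w3`, the prejacent, and the helper functions are hypothetical and are not part of Rullmann et al.’s analysis. It shows that universal quantification over a set of worlds carries over to each of its subsets, while a claim that holds for a small subset need not hold for the full set of best worlds.

```python
from itertools import combinations
from typing import FrozenSet, Iterator, Set

World = str


def nonempty_subsets(worlds: Set[World]) -> Iterator[FrozenSet[World]]:
    """Every candidate output of a choice function applied to `worlds`."""
    items = sorted(worlds)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def holds_throughout(selected: FrozenSet[World], prejacent: Set[World]) -> bool:
    """Universal quantification: the prejacent is true in every selected world."""
    return selected <= prejacent


def entailed_by_larger(best: Set[World], prejacent: Set[World]) -> bool:
    """Whenever the modal claim holds for a selected set, it holds for all its subsets."""
    subsets = list(nonempty_subsets(best))
    return all(
        holds_throughout(small, prejacent)
        for big in subsets if holds_throughout(big, prejacent)
        for small in subsets if small <= big
    )


best_worlds = {"w1", "w2", "w3"}   # hypothetical best worlds for (59)
child_sleeps = {"w1", "w2"}        # hypothetical prejacent: worlds where the child sleeps

identity_choice = frozenset(best_worlds)   # limiting case: strong necessity
narrow_choice = frozenset({"w1"})          # a proper subset: a weaker reading

print(holds_throughout(identity_choice, child_sleeps))
print(holds_throughout(narrow_choice, child_sleeps))
print(entailed_by_larger(best_worlds, child_sleeps))
```

In this toy model the strong reading fails while the narrowed reading succeeds, which is the sense in which the size of the selected subset tracks modal strength.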
Now we turn to the subjunctive. In order to capture the idea that the
subjunctive weakens the c-commanding modal, I analyze the subjunctive as
presupposing that at least one world in the set of best worlds is a world
in which the embedded proposition is false. This will prevent the choice
function from being the identity function.²⁶ This is illustrated informally for
a deontic case in (60).
25 A very sensible suggestion that we should replace Rullmann et al.’s choice function with
an(other) ordering source has been made independently by Kratzer (2009), Portner (2009),
and Peterson (2009, 2010). I will in fact do this below when I compare the current analysis to
that of von Fintel & Iatridou (2008).
(60) gúy’t=as=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
Modal base (presupposed to be circumstantial): Worlds in which the
relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds
are those in which my desire for an early night is fulfilled.
Choice function (must pick out a proper subset of the best worlds, to
avoid a contradiction with the presupposition of the subjunctive): The
very best worlds are those in which my spouse’s desire for an early
night is also fulfilled.
Universal quantification: All the very best worlds are worlds in which
the child sleeps.
(59) allows a strong interpretation which (60) disallows. If the choice
function in (59) is the identity function, the speaker will be satisfied only
if the child sleeps (‘in all the worlds where my desire for an early night is
fulfilled, the child sleeps’). In (60), the speaker will certainly be satisfied if
the child sleeps, but there are also other ways to make him/her happy. (60)
asserts only that ‘in all the worlds where my and my spouse’s desires for an
early night are fulfilled, the child sleeps’ — so the speaker’s desires may be
satisfied if the speaker’s spouse looks after the child while the speaker goes
to sleep. The requirement that (60) places on the child is thus weaker than a
strong necessity.
In the remainder of this section I provide a more formal implementation
of this idea, and in Section 7 I show how the analysis accounts for a wide
range of uses of the St’át’imcets subjunctive, including imperative-weakening,
question-weakening, and ignorance free relatives.
26 Thanks to Hotze Rullmann (p.c.) for discussion of this point. The requirement that p be false
in at least one of the best worlds appears reminiscent of a nonveridicality-style analysis,
and there may be some deep significance to this. However, the analyses are different. For
Giannakidou, the issue is always epistemic, as veridicality is defined in terms of a truth
entailment in an individual’s epistemic model; see (37). Thus, subjunctive is predicted
under verbs like ‘want’, as propositions under ‘want’ are not entailed to be true in any
individual’s epistemic model. Under my analysis, the subjunctive has an anaphoric modal
base and ordering source. I will show in subsection 7.5 that my analysis correctly predicts
the indicative under verbs like ‘want’ in St’át’imcets.
I adopt the following basic definitions from von Fintel & Heim 2007. (61)
shows the ordering of worlds according to how well they satisfy the set of
propositions in the ordering source, and (62) shows how the best worlds are
selected.
(61) Given a set of worlds $X$ and a set of propositions $P$, define the strict
partial order $<_P$ as follows:
$\forall w_1, w_2 \in X : w_1 <_P w_2$ iff $\{p \in P : p(w_2) = 1\} \subset \{p \in P : p(w_1) = 1\}$
For any worlds $w_1$ and $w_2$, $w_1$ comes closer to the ideal set up by
the ordering source than $w_2$ does iff the set of propositions in the
ordering source which are true in $w_2$ is a proper subset of the set of
propositions in the ordering source which are true in $w_1$.
(62) For a given strict partial order $<_P$ on worlds, define the selection
function $max_P$ that selects the set of $<_P$-best worlds from any set $X$
of worlds:
$\forall X \subseteq W : max_P(X) = \{w \in X : \neg\exists w' \in X : w' <_P w\}$
(von Fintel & Heim 2007: 55)
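Definitions (61) and (62) translate fairly directly into executable form. The following is a minimal sketch, assuming a finite set of worlds and modelling each proposition as the set of worlds in which it is true; the function names `closer` and `max_P` and the example worlds `a`, `b`, `c` are hypothetical labels chosen for illustration.

```python
from typing import FrozenSet, Iterable, Set

World = str
Proposition = FrozenSet[World]  # assumption: a proposition is the set of worlds where it is true


def closer(w1: World, w2: World, P: Iterable[Proposition]) -> bool:
    """(61): w1 <_P w2 iff the P-propositions true in w2 are a proper
    subset of the P-propositions true in w1."""
    props = list(P)
    true_in_w1 = {p for p in props if w1 in p}
    true_in_w2 = {p for p in props if w2 in p}
    return true_in_w2 < true_in_w1  # '<' on Python sets is proper subset


def max_P(X: Set[World], P: Iterable[Proposition]) -> Set[World]:
    """(62): the worlds in X for which no world in X is strictly closer to the ideal."""
    props = list(P)
    return {w for w in X if not any(closer(v, w, props) for v in X)}


# Hypothetical ordering source with two propositions.
p1 = frozenset({"a", "b"})
p2 = frozenset({"a"})
X = {"a", "b", "c"}

print(closer("a", "b", [p1, p2]))   # a satisfies p1 and p2, b only p1
print(max_P(X, [p1, p2]))           # only a is undominated
print(max_P(X, [frozenset({"a"}), frozenset({"b"})]))  # a and b are incomparable
```

Because $<_P$ is only a partial order, `max_P` can return several mutually incomparable worlds, as the last call illustrates.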
The best worlds are those for which there are no worlds closer to the ideal
than they are. The analysis of English must is given in (63). must takes as
arguments a modal base, an ordering source and a proposition, and asserts
that in all the best worlds in the modal base, as defined by the ordering
source, the proposition is true.²⁷,²⁸
(63) $\llbracket must \rrbracket^{c,w} = \lambda h_{\langle s,\langle st,t\rangle\rangle} . \lambda g_{\langle s,\langle st,t\rangle\rangle} . \lambda q_{\langle s,t\rangle} . \forall w' \in max_{g(w)}(\cap h(w)) : q(w') = 1$
The analysis of St’át’imcets normative ka is given in (64). ka takes as
arguments a modal base, an ordering source, and a proposition. $f_c$ represents
the contextually given choice function.
27 Nothing crucial hinges on having the conversational backgrounds present in the syntax (as
in von Fintel & Heim 2007) rather than being parameters of interpretation (as in Portner
1997). However, the syntactic version may have a potential advantage in enforcing the
required anaphoricity of the conversational backgrounds once we bring in the subjunctive.
In Rullmann et al.’s (2008) analysis of St’át’imcets modals, the choice function is also a
syntactic argument of the modal. Following the suggestion of an anonymous reviewer, I have
changed this here, but again, nothing crucial hinges on the decision.
28 As an anonymous reviewer reminded me, English must also encodes restrictions on its
modal base and ordering source, parallel to (but obviously different from) those defined for
ka in (64). See for example von Fintel & Gillies 2010 and Matthewson 2010, to appear for
discussion.
(64) $\llbracket ka(h)(g) \rrbracket^{c,w}$ is only defined if $h$ is a circumstantial modal base and
$g$ is a normative ordering source.
If defined, $\llbracket ka(h)(g) \rrbracket^{c,w} = \lambda q_{\langle s,t\rangle} . \forall w' \in f_c(max_{g(w)}(\cap h(w))) : q(w') = 1$
(adapted from Rullmann et al. 2008: 340)
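The contrast between (63) and (64) can be made concrete on a small finite model. The sketch below is an illustration under stated assumptions rather than the authors’ implementation: `h_w` and `g_w` stand for the values $h(w)$ and $g(w)$ at the evaluation world, the presuppositions of ka are reduced to two hypothetical boolean flags, undefinedness is represented as `None`, and the three worlds are invented to mirror the informal scenario in (59).

```python
from typing import Callable, FrozenSet, List, Optional, Set

World = str
Proposition = FrozenSet[World]
ChoiceFunction = Callable[[Set[World]], Set[World]]


def best(h_w: List[Proposition], g_w: List[Proposition], W: Set[World]) -> Set[World]:
    """max_{g(w)}(∩h(w)): the g(w)-best worlds in the intersection of the modal base."""
    base = set(W)
    for p in h_w:
        base &= p

    def satisfied(w: World) -> Set[int]:
        # indices of the ordering-source propositions true in w
        return {i for i, p in enumerate(g_w) if w in p}

    return {w for w in base if not any(satisfied(w) < satisfied(v) for v in base)}


def must(h_w, g_w, q: Proposition, W: Set[World]) -> bool:
    """(63): q is true in all the best worlds."""
    return all(w in q for w in best(h_w, g_w, W))


def ka(h_w, g_w, q: Proposition, W: Set[World], f_c: ChoiceFunction,
       circumstantial: bool = True, normative: bool = True) -> Optional[bool]:
    """(64): defined only for a circumstantial base and a normative ordering source;
    q must be true in all worlds selected by the choice function f_c."""
    if not (circumstantial and normative):
        return None  # presupposition failure
    return all(w in q for w in f_c(best(h_w, g_w, W)))


# Hypothetical worlds: the child sleeps; the spouse minds the wakeful child; neither.
W = {"sleeps", "spouse_minds", "chaos"}
h_w = [frozenset(W)]                               # relevant facts hold in all toy worlds
g_w = [frozenset({"sleeps", "spouse_minds"})]      # the speaker gets an early night
child_sleeps = frozenset({"sleeps"})

identity = lambda ws: ws
narrowing = lambda ws: {w for w in ws if w == "sleeps"}

print(must(h_w, g_w, child_sleeps, W))
print(ka(h_w, g_w, child_sleeps, W, identity))
print(ka(h_w, g_w, child_sleeps, W, narrowing))
```

With the identity choice function, ka behaves exactly like must; a choice function that returns a proper subset of the best worlds makes the same sentence true on a weaker reading.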
Now for the subjunctive. As shown in (65), the subjunctive does not affect
truth conditions but merely enforces a weaker-than-necessity reading of a
modal in the environment. The subjunctive does not itself introduce any
conversational backgrounds; h and g in (65) are free variables. I assume
that this enforces anaphoricity: the mood must be c-commanded by a modal
which introduces h and g.²⁹
(65) $\llbracket sbjn(\varphi) \rrbracket^{c,w}$ is only defined if $\exists w' \in max_{g(w)}(\cap h(w)) [\varphi(w') = 0]$.
When defined, $\llbracket sbjn(\varphi) \rrbracket^{c,w} = \lambda w' . \llbracket \varphi \rrbracket^{c,w'}$
According to (65), the subjunctive is only defined if there is at least one
world $w'$ in the set of best worlds in the modal base, as defined by the
ordering source, such that $\varphi$ is false in $w'$. The analysis is applied to a
normative subjunctive case in (66).
(66) gúy’t=as=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
$\llbracket ka(h)(g)(as(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w}$ is only defined if
i. $h$ is a circumstantial modal base and $g$ is a normative ordering source
ii. $\exists w' \in max_{g(w)}(\cap h(w))$ [the child doesn’t sleep in $w'$]
When defined, $\llbracket ka(h)(g)(as(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w} = 1$ iff
$\forall w' \in f_c(max_{g(w)}(\cap h(w)))$ [the child sleeps in $w'$]
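The interaction between the presupposition in (65) and the choice function in (66) can be traced on a toy model. The following is a minimal sketch under simplifying assumptions: the set $max_{g(w)}(\cap h(w))$ is supplied directly as `best`, the presuppositions of ka on $h$ and $g$ are taken to be satisfied, undefinedness is represented as `None`, and the two worlds and both choice functions are hypothetical.

```python
from typing import Callable, FrozenSet, Optional

World = str
Worlds = FrozenSet[World]


def sbjn_defined(best: Worlds, phi: Worlds) -> bool:
    """(65): defined iff phi is false in at least one of the best worlds."""
    return any(w not in phi for w in best)


def ka_sbjn(best: Worlds, phi: Worlds, f_c: Callable[[Worlds], Worlds]) -> Optional[bool]:
    """(66): None if the subjunctive presupposition fails; otherwise phi must be
    true in every world selected by the choice function f_c."""
    if not sbjn_defined(best, phi):
        return None
    return all(w in phi for w in f_c(best))


# Hypothetical best worlds: in one the child sleeps, in the other the spouse minds the child.
best_worlds = frozenset({"sleeps", "spouse_minds"})
child_sleeps = frozenset({"sleeps"})

identity = lambda ws: ws
very_best = lambda ws: frozenset(w for w in ws if w == "sleeps")

print(ka_sbjn(best_worlds, child_sleeps, identity))    # identity choice function
print(ka_sbjn(best_worlds, child_sleeps, very_best))   # proper subset of the best worlds
print(ka_sbjn(best_worlds, best_worlds, identity))     # prejacent true in all best worlds
```

In this sketch the sentence can only come out defined and true when the choice function returns a proper subset of the best worlds: with the identity function, the assertion contradicts the presupposition.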
As above, $max_{g(w)}(\cap h(w))$ picks out the best worlds in the modal base,
as defined by the normative ordering source. The contextually determined
choice function $f_c$ picks out a subset of $max_{g(w)}(\cap h(w))$, and the modal
universally quantifies over the set picked out by the choice function. Because
the subjunctive mood presupposes that there is at least one world
in $max_{g(w)}(\cap h(w))$ in which the proposition is false, the choice function
must pick out a proper subset of the worlds provided by the modal base
and ordering source. This forces a weaker-than-universal reading. We in
fact predict gradient readings with the subjunctive: anything from pure
possibility to weak necessity. This seems to fit with the facts about when the
subjunctive is felicitous.
29 Thanks to an anonymous reviewer for pointing out an inconsistency in an earlier version of
(65).
I have so far been simply following Portner (1997) in modeling the mood
restriction as a presupposition, rather than as ordinary asserted content, or
some other kind of inference. The question arises of whether there is any
St’át’imcets-internal justification for the assumption that presupposition is
involved.³⁰
If the subjunctive contributed ordinary asserted content, we would predict
that it would fail to project through presupposition holes such as negation or
conditionals, and that it could be directly affirmed or denied by the hearer.
The issue of projection through presupposition holes is not testable for
most of the relevant constructions in St’át’imcets. For example, negation in
St’át’imcets is a predicate which embeds an obligatorily nominalized (i.e.,
indicative) subordinate clause. When a subjunctive clitic does co-occur with
negation, it attaches to the negation itself, as shown in (67). Thus, while (67) is
not interpretable in a way which would show that the subjunctive contributed
asserted content, the results are not conclusive because the subjunctive is
probably not scoping under negation syntactically.
(67) cw7aoz=as=ká=t’u7 kw=s=nas=ts
neg=3sbjn=deon=prt det=nom=go=3poss
‘I wish he wouldn’t go.’ (van Eijk 1997: 214)
≠ ‘It is not the case that [in at least one of the best worlds in the
modal base, he doesn’t go, and in all of the set of worlds selected by
the choice function, he goes].’
i.e., ≠ ‘It is not the case that [it’s good if he goes, and I can still be
happy if he doesn’t].’
Nor can we test projection through ‘if’, as ‘if’-clauses obligatorily and redundantly
select the subjunctive in St’át’imcets (see subsection 2.2). However,
questions provide evidence that the subjunctive does not contribute ordinary
asserted content. Recall that the subjunctive plus an inferential evidential
when added to a question results in a statement of uncertainty (16)–(20).
The question in (68) cannot be interpreted as if the subjunctive contributed
asserted content which scopes below the question. (See subsection 7.2 for
analysis of questions like (68).)
30 Thanks to David Beaver and an anonymous reviewer for asking for clarification of this issue.
(68) nilh=as=há=k’a s=Lémya7 ku=kúkwpi7
foc=3sbjn=ynq=infer nom=Lémya7 det=chief
‘I think maybe Lémya7 is the chief / I wonder if Lémya7 is the chief.’
≠ ‘Is it the case that [in at least one of the best worlds compatible
with the inferential evidence, Lémya7 is not the chief, and in all of the
set of worlds selected by the choice function, Lémya7 is the chief]?’
i.e., ≠ ‘Is it the case that [Lémya7 is possibly but not necessarily the
chief]?’
Further evidence that the subjunctive does not contribute ordinary as-
serted content comes from the impossibility of directly affirming or denying
its contribution. This is shown in (69), where B and B’ try to deny A’s sub-
junctive claim that in at least one world compatible with A’s knowledge and
desires, the children don’t sleep. The consultant absolutely rejects the replies
in B and B’.
(69) A guy’t=ás=ka i=sk’wemk’úk’wm’it=a
sleep=3sbjn=deon det.pl=child(pl)=exis
‘I hope the children sleep.’
B #cw7aoz kw=s=wenácw. plán=lhkacw zewát-en
neg det=nom=true already=2sg.subj know-dir
kw=s=cuz’ gúy’t=wit
det=nom=going.to sleep=3pl
‘That’s not true. You already know they will sleep.’
B’ #cw7aoz kw=s=wenácw. lh=cw7áoz=as
neg det=nom=true comp=neg=3sbjn
kw=s=gúy’t=wit i=sk’wemk’úk’wm’it=a, áoz=kelh
det=nom=sleep=3pl det.pl=child(pl)=exis neg=fut
kw=a=s áma ta=scwákwekw-sw=a
det=impf=3poss good det=heart-2sg.poss=exis
‘That’s not true. If the children don’t sleep, you won’t be happy.’
Having established that the weakening contribution of the subjunctive is
not ordinary asserted content, the question now is whether it contributes a
presupposition per se, or some other not-at-issue content, such as a Potts
(2005)-style conventional implicature. One major empirical difference between
a traditional understanding of presuppositions (e.g., Stalnaker 1974)
and conventional implicatures is that only the former impose constraints on
the state of the common ground. Conventional implicatures, in contrast, stan-
dardly contribute information which is new to the hearer (Potts 2005). I have
argued elsewhere (Matthewson 2006, 2008b) that St’át’imcets entirely lacks
presuppositions of the common ground type; all not-at-issue content in this
language is treated as potentially new to the hearer.³¹ In those earlier works I
argued that the St’át’imcets facts necessitate an alternative analysis of pre-
supposition (for example that of Gauker 1998). However, another way to look
at things is to say that out of the class of not-at-issue meanings, St’át’imcets
lacks one sub-type, namely common ground presuppositions. What I have
modeled as a presupposition of the St’át’imcets subjunctive would then be
some other kind of not-at-issue content, perhaps a conventional implicature.
However, these issues go beyond the scope of the present paper and do not
affect the main points being made here, so with these caveats I will continue
to model the subjunctive as introducing a presupposition.
Before turning to more complex constructions involving the subjunc-
tive, it is interesting to consider the similarity between the analysis of the
St’át’imcets subjunctive provided here and von Fintel & Iatridou’s (2008) ideas
about weak necessity modals. von Fintel and Iatridou are concerned with
the difference in quantificational strength between ought and have to/must.
In (70), we see that the restriction on employees is stronger than that on
everyone else.
(70) After using the bathroom, everybody ought to wash their hands;
employees have to.
(von Fintel & Iatridou 2008: 116)
(71) also illustrates the contrast between the different modal strengths.
In (71a), taking Route 2 is the only option, if you want to get to Ashfield: all
the worlds in which you get to Ashfield are Route 2-worlds. In (71b), there
are other getting-to-Ashfield worlds apart from only Route 2-worlds. But the
Route-2 worlds are the best, taking into consideration some other factors
(such as a scenic route).
31 For example, attempts to elicit ‘Hey, wait a minute!’ responses to presupposition failures for
a wide range of standard presupposition triggers have all failed (Matthewson 2006, 2008b).
We are therefore unable to decide the presupposition issue for the subjunctive by using the
‘Hey, wait a minute!’ test (as was suggested by an anonymous reviewer).
(71) a. To go to Ashfield, you have to / must take Route 2.
b. To go to Ashfield, you ought to take Route 2.
von Fintel and Iatridou argue that ought is a weak necessity modal, and
that weak necessity modals signal the existence of a secondary ordering
source. This is illustrated informally in (72)–(73). (72) contains a strong
necessity modal, and gives a strong reading, as usual. In (73), a secondary
ordering source further restricts the set of worlds which are universally
quantified over, leading to a weaker reading.
(72) To go to Ashfield, you have to / must take Route 2.
Modal base: Restricts worlds considered to those in which the same
facts about roads hold as in the actual world.
Ordering source: Orders worlds in the modal base so that the best
worlds are those in which you attain your goal of getting to Ashfield.
Universal quantification: In all the best worlds, you take Route 2.
(73) To go to Ashfield, you ought to take Route 2.
Modal base: Restricts worlds considered to those in which the same
facts about roads hold as in the actual world.
Ordering source 1: Orders worlds in the modal base so that the best
worlds are those in which you attain your goal of getting to Ashfield.
Ordering source 2: Further orders the best worlds picked out by
ordering source 1, so that the very best worlds are those in which
you not only attain your goal of getting to Ashfield, but also attain an
additional goal of going via a scenic route.
Universal quantification: In all the very best worlds, you take Route
2.
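The informal contrast between (72) and (73) can be simulated with two rounds of ordering. The following is a rough sketch only: von Fintel and Iatridou do not give a formal definition of a modal with two ordering sources, so applying the secondary ordering to the output of the primary one is an assumption of this illustration, as are the function names and the three worlds (`highway` is an invented alternative route).

```python
from typing import FrozenSet, List, Set

World = str
Proposition = FrozenSet[World]


def best(worlds: Set[World], ordering: List[Proposition]) -> Set[World]:
    """Worlds not strictly dominated with respect to the ordering-source propositions."""
    def satisfied(w: World) -> Set[int]:
        return {i for i, p in enumerate(ordering) if w in p}
    return {w for w in worlds if not any(satisfied(w) < satisfied(v) for v in worlds)}


def have_to(modal_base: Set[World], primary: List[Proposition], q: Proposition) -> bool:
    """Strong necessity as in (72): q holds in all the best worlds."""
    return best(modal_base, primary) <= q


def ought(modal_base: Set[World], primary: List[Proposition],
          secondary: List[Proposition], q: Proposition) -> bool:
    """Weak necessity as in (73): q holds in all the very best worlds, i.e. the
    best worlds re-ranked by the secondary ordering source (assumed implementation)."""
    return best(best(modal_base, primary), secondary) <= q


# Hypothetical worlds: take Route 2, take another road, or stay home.
modal_base = {"route2", "highway", "stay_home"}
reach_ashfield = [frozenset({"route2", "highway"})]   # primary goal
scenic = [frozenset({"route2"})]                      # secondary goal
take_route2 = frozenset({"route2"})

print(have_to(modal_base, reach_ashfield, take_route2))
print(ought(modal_base, reach_ashfield, scenic, take_route2))
print(have_to({"route2", "stay_home"}, reach_ashfield, take_route2))
```

When another road also reaches Ashfield, only the weak necessity claim holds; when Route 2 is the sole way to get there, the strong claim holds as well.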
As von Fintel & Iatridou (2008: 137) put it: ‘The idea is that saying that
to go to Ashfield you ought to take Route 2, because it’s the most scenic
way, is the same as saying that to go to Ashfield in the most scenic way,
you have to take Route 2.’ This is very parallel in spirit to Rullmann et al.’s
(2008) analysis of St’át’imcets modals, where a weak reading is obtained by
a universal quantifier with a restriction provided by a choice function. And
just like Rullmann et al.’s analysis, von Fintel and Iatridou’s actually predicts
gradience: how ‘weak’ a weak necessity modal is can vary, depending on
which secondary ordering source you pick. In fact, given that the motivation
for using a choice function rather than an ordering source was unconvincing
anyway (cf. Kratzer 2009, Peterson 2009, 2010, and Portner 2009), the
Rullmann et al.-style analysis is better implemented using a double ordering
source, exactly as in von Fintel & Iatridou 2008.³²
So what is the difference between English and St’át’imcets? Simply that in
English, we lexically encode the weak necessity (ought vs. have to/must). In
St’át’imcets, no differences in modal force are lexically encoded by modals,
but what English modals do, St’át’imcets does via mood. Another way of
describing the analysis offered here would be to say that the St’át’imcets
subjunctive enforces weak necessity (via domain restriction): it forces there
to be two (non-vacuous) restrictions on the set of worlds in the modal base.
While further cross-linguistic investigation goes beyond the scope of this
paper, it is worth pointing out a connection to another intriguing observation
of von Fintel and Iatridou’s, namely that in many languages, weak necessity
modals are created transparently from a strong necessity modal plus coun-
terfactual morphology. This is illustrated in (74) for French, where the modal
appears in the conditional mood, the one which occurs in counterfactual
conditionals.
(74) tout le monde devrait se laver les mains mais les serveurs
everybody must/cond refl wash the hands but the waiters
sont obligés
are obliged
‘Everybody ought to wash their hands but the waiters have to.’
This is very reminiscent of St’át’imcets, where a modal which introduces
universal quantification gives rise to weak necessity interpretations in the
presence of the subjunctive. In St’át’imcets, I have analyzed the weakening
effect as the sole contribution of the subjunctive mood. Of course, ‘counter-
factual’ and ‘subjunctive’ are not the same thing, and I am not in a position
to claim that the current analysis of the subjunctive can extend to counter-
factual morphology in the languages discussed by von Fintel and Iatridou.
However, the present analysis at the very least supports von Fintel and Iatri-
dou’s cross-linguistic generalization that mood morphology can derive weak
32 Like von Fintel and Iatridou, I omit a formal definition of a modal with a double ordering
source; see von Fintel & Iatridou 2008: 138 for some suggestions on how to do this.
necessity interpretations, and may offer a potential new avenue for looking
at languages like French.
7 Applying the analysis to other subjunctive constructions
In the previous section I presented an analysis of the St’át’imcets subjunctive
and applied it to cases involving a normative modal. In this section I aim
to establish that the analysis of the subjunctive as restricting the conver-
sational background of a co-occurring modal can extend to the other uses
of the subjunctive. I deal in turn with imperatives (subsection 7.1), ques-
tions (subsection 7.2), ignorance free relatives (subsection 7.3), the ‘pretend’
cases (subsection 7.4), and finally I return to the fact that in St’át’imcets, the
subjunctive is not licensed by any attitude verbs (subsection 7.5).
7.1 Imperatives
Recall that the subjunctive, when added to an imperative, makes the com-
mand more polite. An example is repeated here:
(75) a. lts7á=malh lh=kits-in’=ál’ap!
deic=adhort comp=put.down-dir=2pl.sbjn
‘Just put it over here!’
b. lts7á=has=malh lh=kits-in’=ál’ap
deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn
‘Could you put it down here?’/‘You may as well put it down over
here.’
The easiest way to analyze the imperatives would be as sub-cases of the
deontic cases already analyzed above. We could say that the imperative
introduces a deontic necessity modal, and the subjunctive weakens the
proposition expressed. That is what I will in fact say, adopting Schwager’s
(2005, 2006) analysis of imperatives.
Schwager (2005, 2006) claims that imperatives introduce a modal opera-
tor, which is a more restricted version of a deontic necessity modal.33 Nor-
mally, the imperative modal expresses necessity, with the Common Ground
33 See Han 1997, 1999 for an earlier proposal of a similar idea. Han’s modal analysis shares
many of the advantages for St’át’imcets of Schwager’s approach. However, since Han models
the modal claim of the imperative as a presupposition rather than part of the assertion,
extra assumptions would be required to apply it to St’át’imcets subjunctive imperatives.
serving as the modal base, and a contextually given set of preferences giv-
ing the ordering source. In addition, imperatives carry presuppositions, as
shown in (76). The presuppositions restrict an imperative to situations where
a performative use of a deontic modal would be possible, namely those in
which the speaker is an authority on the matter.34
(76) Presuppositions of an imperative:
1. The speaker is an authority on the parameters. [modal base and
ordering source]
2. The ordering source is preference-related.35
3. The speaker affirms the ordering source as a good maxim for
acting in the given scenario. (Schwager 2006: 248-249)
A simple case is illustrated in (77).
(77) Get up!
Modal base: What the speaker and hearer jointly take to be possible
Ordering source: The speaker’s commands
(77) is true iff all worlds in the Common Ground that make true as much as
possible of what the speaker commands at the world and time of utterance
make it true that the addressee gets up within a certain event frame t
(Schwager 2005: chapter 6). The difference between (77) and the plain modal
statement ‘You must get up’ is that with the imperative, the speaker is
presupposed to be an authority. This has the consequence that whenever an
imperative is defined, it is necessarily true.
Adopting Schwager’s analysis enables us to treat the St’át’imcets sub-
junctive imperatives the same way we treated the weakened normative ka-
statements above. We have to assume that the deontic modal in a St’át’imcets
imperative is, like the overt ka, a universal modal which introduces a choice
function or secondary ordering source. While a normal imperative roughly
says that in all the best worlds (the worlds where you obey my commands),
34 The descriptive vs. performative use of a deontic modal is shown in (i), from Schwager
(2008: 26).
(i) a. Peter may come tomorrow. (The hostess said it was no problem.) descriptive
b. Okay, you may come at 11. (Are you content now?) performative
35 The preferences may relate to the addressee’s wishes, as in the case of advice or suggestions.
you do P, a subjunctive imperative presupposes that at least one world in
which you obey my commands is a world in which you do not do P. This
predicts that a weakened imperative means that in the very best worlds,
you do P, but there are other ways to satisfy me. The requirement on the
addressee becomes weaker, just as the requirement on the child to sleep
becomes weaker in the examples discussed above.
An advantage of Schwager’s analysis for St’át’imcets is that it makes the
correct predictions for ‘permission imperatives’ like ‘Have a cookie!’ These
do not perform a speech act of ordering, but rather of invitation. It might be
natural to think that permission imperatives involve a possibility modal, but
Schwager argues that imperatives always introduce a necessity operator. For
Schwager, the permission effect arises due to the contextual parameters; this
is shown in (78).
(78) Take an apple if you like!
Given what we know the world to be like and given what you want, it
is necessary that you take an apple. (cf. Schwager 2008: 49)
Under Schwager’s analysis, then, the difference between an order and an
invitation consists not in a difference in quantificational force, but in ordering
source. This correctly predicts that in St’át’imcets, permission imperatives
do not have to take the subjunctive:36,37
(79) Context: Your friend comes over and is visiting with you. You hear
her stomach rumbling. You give her a plate and say ‘Have some cake!’
a. wá7=malh kiks-tsín-em
be=adhort cake-eat-mid
‘Have some cake!’
b. #wá7=acw=malh kiks-tsín-em
be=2sg.sbjn=adhort cake-eat-mid
‘You may as well have some cake.’
36 (79b) is marked as infelicitous in this context, which is how the consultant judges it. (80b)
appears to be ungrammatical. The difference possibly relates to the presence in (79b) of the
adhortative particle malh, an interesting element whose analysis must await future research.
37 An anonymous reviewer points out that permission imperatives should be able to take the
subjunctive in certain circumstances, meaning something like ‘the very best way to achieve
your desires is p, though there are other ways’. Future research is required to see whether
this prediction is upheld once the right discourse contexts are provided.
(80) Context: You are at a gathering and they are almost running out of
food. You take the last piece of fish and then you see an elder is
behind you and is looking disappointed and has no fish on her plate.
You say ‘Take mine!’
a. kwan ts7a ti=n-tsúw7=a
take(dir) deic det=1sg.poss-own=exis
‘Take mine!’
b. *kwán=acw ts7a ti=n-tsúw7=a
take(dir)=2sg.sbjn deic det=1sg.poss-own=exis
intended: ‘Take mine!’
We have seen that an analysis of imperatives as containing a concealed
necessity modal works for St’át’imcets. In the remainder of this section I
briefly discuss the alternative analysis of Portner (2004, 2007).
Portner’s (2004, 2007) analysis of imperatives relies on the notion of a
‘To-Do List’. The idea is that each participant in a conversation has a To-Do
List, a set of properties which they are committed to satisfying. The To-Do
list Function (which maps each participant to their own To-Do List) is a
component of the Discourse Context (along with the Common Ground and
the Question Set). An imperative, as in (81), denotes a property whose subject
is the addressee. This causes the property to be added to the addressee’s
To-Do List.
(81) $\llbracket \text{Leave!} \rrbracket^{w^*,c} = [\lambda w \lambda x : x = \text{addressee}(c) \,.\, x \text{ leaves in } w]$

As in Schwager’s analysis, ‘permission’ imperatives are dealt
with in Portner’s analysis by the counterpart of the ordering source, namely
different sub-sets of the To-Do List. The To-Do List is divided into deontic,
bouletic and teleological sub-parts, corresponding to orders, invitations, and
suggestions respectively. The addressee can therefore keep track of actions
she is supposed to take to satisfy someone’s orders, her own wishes, or her
own goals.
An important feature of this analysis is that under the To-Do List ap-
proach, imperatives do not contain modal operators. While for Portner,
imperatives and root modals are closely linked — for example, the successful
utterance of an imperative leads to the truth of a corresponding sentence
containing a root modal — imperatives do not themselves contain modals.38
38 See Portner 2007: 363ff for arguments against Han’s (1999) and Schwager’s (2005, 2006)
analysis of imperatives as containing concealed modals.
My analysis of the St’át’imcets subjunctive, however, seems to require the
presence of a modal, whose force is functionally weakened via a restriction
on the conversational background. A unified analysis of the St’át’imcets
subjunctive across all its uses would therefore seem to require a modal in
the imperative.
However, as pointed out by an anonymous reviewer, Portner’s analysis of
imperatives will work for St’át’imcets. The lexical entry for the subjunctive
given above in (65) does not literally require the presence of a governing
modal; it merely requires the presence of contextually available conversa-
tional backgrounds. These are provided within Portner’s analysis, given that
the Common Ground corresponds to (at least a subset of) a circumstantial
modal base, while a To-Do List corresponds to (at least a subset of) a deon-
tic, bouletic or teleological ordering source. To apply Portner’s analysis to
St’át’imcets, we only need to assume that the imperative morpheme can take
the Common Ground plus two To-Do Lists as arguments. The subjunctive
will presuppose that there is a world among the best worlds in the Common
Ground, according to To-Do List 1, in which the imperative is not satisfied.
Assuming that the second To-Do List is ‘more ignorable’ than the first (cf.
also von Fintel and Iatridou 2008 on the primacy of the first ordering source),
then a hearer can decide to be bound either by both To-Do Lists, or only by
the first. If the speaker has set up her own desires as the secondary To-Do
List, we obtain the politeness reading typical of a St’át’imcets subjunctive
imperative.
In summary, we have seen that our analysis of the St’át’imcets subjunctive
extends to the weakened imperatives, as long as we assume that imperatives
are concealed normative modal statements, or at least provide the same
conversational backgrounds as a normative modal. This idea can be implemented
within the approach of either Schwager (2005, 2006, 2008) or
Portner (2004, 2007).
7.2 Questions
The subjunctive appears, in combination with an evidential or future modal,
in both yes-no and wh questions in St’át’imcets, in each case turning the
question into a statement of uncertainty. Some examples are repeated here.
Following Littell, Matthewson & Peterson (2009), I use the term ‘conjectural
question’ for this construction.
(82) a. lán=ha kwán-ens-as
already=ynq take-dir-3.erg
ni=n-s-mets-cál=a
‘Has she already got my letter?’
b. lan=as=há=k’a kwán-ens-as
already=3.sbjn=ynq=infer take-dir-3.erg
ni=n-s-mets-cál=a
‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got
my letter or not.’
(83) a. nká7=kelh lh=cúz’=acw nas
where=fut comp=going.to=2sg.sbjn go
‘Where will you go?’
b. nká7=as=kelh lh=cúz’=acw nas
‘Wherever will you go?’ / ‘I wonder where you are going to go
now.’ (adapted from Davis 2006: chapter 24)
Previous discussion of conjectural questions in Salish includes Matthewson
2008a, Littell et al. 2009 and Littell 2009.39 The analysis given here will
essentially be that of Littell (2009), with the addition of an account of the
role of the subjunctive (which Littell does not discuss), and an extension to
cases where the subjunctive in a conjectural question is licensed by a future
modal, rather than an evidential.
The paradigms in (84) and (85) illustrate the distributional facts for
conjectural questions which contain an evidential (as opposed to a future
modal). We see that the evidential is obligatory (the (b) examples), but
the subjunctive — while strongly preferred — is not quite obligatory (the (c)
examples).40
(84) a. t’íq=Ø=ha k=Bill
arrive=indic=ynq det=Bill
‘Did Bill arrive?’ indic
39 Littell et al. (2009) investigate conjectural questions in three languages: St’át’imcets,
NìePkepmxcín (Thompson Salish) and Gitksan, while Littell (2009) focuses mainly on
NìePkepmxcín.
40 While subjunctive evidential questions (as in (84d), (85d)) are obligatorily interpreted as
statements of uncertainty rather than questions, indicative evidential questions (as in (84c),
(85c)) can optionally be interpreted as ordinary questions. I return to this below.
b. *t’íq=as=ha k=Bill
arrive=3sbjn=ynq det=Bill sbjn
c. ?t’íq=ha=k’a k=Bill
arrive=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + indic
d. t’iq=as=há=k’a k=Bill
arrive=3sbjn=ynq=infer det=Bill
‘I wonder if Bill arrived.’ evid + sbjn
(85) a. ínwat=wit
say.what=3pl
‘What did they say?’ indic
b. *inwat=wít=as
say.what=3pl=3sbjn sbjn
c. ??inwat=wít=k’a
say.what=3pl=infer
‘I wonder what they said.’ evid + indic
d. inwat=wít=as=k’a
say.what=3pl=3sbjn=infer
‘I wonder what they said.’ evid + sbjn
As argued in the above-mentioned references, conjectural questions have
the syntax and the semantics of a question, but the pragmatics of an assertion
(as they do not require an answer in discourse). With respect to
syntax, conjectural questions clearly pattern with ordinary questions. Littell
et al. (2009) point out that not only do conjectural questions contain the
normal yes-no question particle or sentence-initial wh-phrase plus extraction
morphology, they embed under the same predicates as ordinary questions
do. This is shown in (86).
(86) aoz kw=s=zwát-en-as k=Lisa
neg det=nom=know-dir-3erg det=Lisa
lh=wa7=as=há=k’a áma-s-as k=Rose ku=tíh
comp=impf=3sbjn=ynq=infer good-caus-3erg det=Rose det=tea
‘Lisa doesn’t know whether Rose likes tea.’
The ability to embed under question-taking predicates is prima facie
evidence that conjectural questions have the same semantic type as ordinary
questions.
Pragmatically, however, conjectural questions do not behave like ordinary
questions, because conjectural questions do not require an answer from the
addressee. In fact, conjectural questions are infelicitous in any situation
where the hearer can be assumed to know the answer. This is illustrated in
(87).41
(87) a. ??lan=acw=há=k’a q’a7
already=2sg.sbjn=ynq=infer eat
‘I wonder if you’ve already eaten.’
b. Context: You see your friend wearing a watch and you say:
??zwat-en=ácw=ha=k’a
know-dir=2sg.sbjn=ynq=infer
lh=k’wín=as=t’elh
comp=how.many=3sbjn=now
‘Would you know what the time was?’
Consultant’s comment: “You wouldn’t have seen the watch if
you say this.”
Nor are conjectural questions a type of rhetorical question. Han (2002)
argues that rhetorical questions have the force of a negative assertion, as in
(88).
(88) Did I tell you it would be easy? ≈ I didn’t tell you it would be easy.
But this is not the meaning we get in St’át’imcets for conjectural questions.
In order to express a true rhetorical question, St’át’imcets speakers use
something which is string-identical to an ordinary question, just as in English.
This is illustrated in (89)–(90). (90b) shows that adding a subjunctive plus an
evidential to a rhetorical question results in rejection of the utterance.
(89) Context: Your daughter is complaining that learning how to cut fish
is hard. You say:
a. tsun-tsi=lhkán=ha k=wa=s lil’q
say(dir)-2sg.obj=1sg.indic=ynq det=impf=3poss easy
‘Did I tell you it would be easy?’
41 See Rocci 2007: 147 for the same claim for an Italian construction with similar semantics to
St’át’imcets conjectural questions.
b. swat ku=tsút k=wa=s lil’q
who det=say det=impf=3poss easy
‘Who said it would be easy?’
(90) Context: You are at the PNE (a fair) and there is this very scary ride
which looks really dangerous. Your friend asks you if you are going
to go on it. You say:
a. tsut-anwas=kácw=ha kw=en=klíisi
say-inside=2sg.indic=ynq det=1sg.poss=crazy
‘Do you think I’m crazy?’
b. *tsut-anwas=ácw=ha=k’a kw=en=klíisi
say-inside=2sg.sbjn=ynq=infer det=1sg.poss=crazy
‘Do you think I’m crazy?’
The status of speaker and addressee knowledge also differs between rhetori-
cal questions and conjectural questions. In rhetorical questions, the speaker
knows the true answer to the question, and typically assumes that the hearer
does as well (e.g., Caponigro & Sprouse 2007). Subjunctive questions are the
exact opposite: neither the speaker nor the addressee typically knows the
answer.
In the remainder of this section I will first present the analysis of conjec-
tural questions which contain evidentials, and then explain an interesting
difference between the evidential and the future with respect to subjunctive
licensing.
First, we need an analysis of questions. I adopt a fairly standard approach,
according to which a question denotes a set of propositions, each of which
is a (partial, true or false) answer to the question (Hamblin 1973).42 This is
illustrated in (91)–(92).
(91) $\llbracket \text{does Hotze smoke} \rrbracket^{w}$ = {that Hotze smokes, that Hotze does not
smoke}
(92) $\llbracket \text{who left me this fish} \rrbracket^{w}$ = {that Ryan left me this fish, that Meagan
left me this fish, that Ileana left me this fish,...} = {p : ∃x[p = that x
left me this fish]}
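The Hamblin-style denotations in (91)–(92) can be made concrete with a small Python sketch. It assumes, purely for illustration, that a world is identified with the set of individuals who left the fish and that a proposition is the set of worlds in which it holds; the names `WORLDS`, `prop`, `wh_question` and `polar_question` are hypothetical helpers, not part of any cited analysis.

```python
from itertools import combinations

PEOPLE = ("Ryan", "Meagan", "Ileana")

# Assumption: a world is the frozenset of individuals who left the fish.
WORLDS = [frozenset(c)
          for r in range(len(PEOPLE) + 1)
          for c in combinations(PEOPLE, r)]


def prop(pred):
    """A proposition is the set of worlds in which `pred` holds."""
    return frozenset(w for w in WORLDS if pred(w))


def wh_question(domain, open_prop):
    """{p : there is an x in domain such that p = open_prop(x)}, as in (92)."""
    return {prop(lambda w, x=x: open_prop(x, w)) for x in domain}


def polar_question(pred):
    """{p, not-p}, as in (91)."""
    p = prop(pred)
    return {p, frozenset(WORLDS) - p}


who_left = wh_question(PEOPLE, lambda x, w: x in w)
did_ryan_leave = polar_question(lambda w: "Ryan" in w)
```

Here `who_left` contains one proposition per individual, and `did_ryan_leave` contains exactly two propositions that together partition the worlds. Nothing in the sketch marks any alternative as the true answer, in keeping with the assumption that the set contains both true and false answers.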
42 As far as I am aware, this choice is not critical and a different approach to questions would
work just as well.
Next, we need an analysis for the inferential evidential k’a. I adopt
Matthewson et al.’s (2007) and Rullmann et al.’s (2008) analysis of k’a as an
epistemic modal with a presupposition about evidence source.
(93) $\llbracket \text{k’a}(h)(g) \rrbracket^{c,w}$ is only defined if $h$ is an epistemic modal base, $g$ is a
stereotypical ordering source, and for all worlds $w'$, $\cap h(w')$ is the set
of worlds in which the inferential evidence in $w$ holds.
If defined, $\llbracket \text{k’a}(h)(g) \rrbracket^{c,w} = \lambda q_{\langle s,t \rangle} .\, \forall w' \in f_c(\max_{g(w)}(\cap h(w)))\,[q(w') = 1]$
(adapted from Matthewson et al. 2007: 245)
I assume that the evidential modal scopes under the question operator,
so that each proposition in the question denotation contains the evidential.
A conjectural question thus bears some similarity to an English question
containing a possibility modal (e.g., ‘Could Bill have (possibly) arrived?’), with
the additional factor that the evidential introduces a presupposition about
evidence source. Following Guerzoni (2003), I assume that when a question
contains a presupposition trigger, each proposition in the alternative set
carries the relevant presupposition. The question therefore denotes a set of
alternative partial propositions. This is illustrated in (94).43
(94) a. t’iq=as=há=k’a k=Bill
arrive=3sbjn=ynq=infer det=Bill
‘I wonder if Bill arrived.’
b. Alternatives introduced by (94a):
{that Bill possibly arrived [presupposing there is inferential evi-
dence that Bill arrived], that Bill possibly did not arrive [presup-
posing there is inferential evidence that Bill did not arrive]}
Notice that the evidence presuppositions of the two propositions in (94b)
conflict with each other: there is presupposed to be evidence both that Bill
did arrive, and that Bill did not arrive. As Guerzoni (2003) has shown for the
presuppositions of English even, questions whose alternative propositions
introduce different presuppositions end up presupposing the conjunction of
all the individual presuppositions. Take, for example, the question in (95).
(95) Guess who even solved Problem 2?

43 Recall that although (94a) is translated into English using wonder, the meaning of (94a) does
not include an attitude verb. The claim is that (94a) denotes a set of alternative propositions.
This question introduces ‘a set of alternative partial propositions that for
each relevant person x contains an answer asserting that x solved Problem
2 and presupposing that solving problem 2 was less likely for x than solving
any other relevant problem’ (Guerzoni 2003: 127). Guerzoni then observes
that a speaker who utters (95)
knows that for any arbitrary individual in the restrictor of who,
if the addressee answers that that individual solved the problem,
he will automatically presuppose that the problem was
difficult for that person. Moreover, if the speaker is unbiased,
she doesn’t know in advance (and has no expectations regard-
ing) which propositions will be chosen by the addressee as the
true answer to her question. Given this, it must be the case
that she is taking for granted that the problem was hard for
every arbitrary x in the restrictor of who. Since the addressee
will be able to infer this much, the question is a presupposition
failure unless this condition is indeed satisfied in the context
of the conversation (Guerzoni 2003: 128).
Applying this idea to the St’át’imcets conjectural questions, we obtain the
result that an utterance of (94a) commits the speaker to the presupposition
that there is evidence both that Bill did arrive, and that he did not. This is
illustrated in (96).
(96) Alternatives introduced by (94a):
{that Bill possibly arrived, that Bill possibly did not arrive}
Presupposition of (94a):
There is inferential evidence both that Bill arrived and that Bill did
not arrive
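The step from (94b) to (96) can be pictured with a minimal Python sketch, in which each alternative is a partial proposition pairing a presupposition with its asserted content, and the question as a whole is felicitous only if every alternative's presupposition holds in the context. The class `Partial`, the function names, and the use of plain strings for propositions are simplifying assumptions made for this illustration; they are not Guerzoni's formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """A partial proposition: defined only where its presupposition holds."""
    presupposition: str
    content: str


def question_presupposition(alternatives):
    """Conjunction (here: the list) of all alternatives' presuppositions."""
    return [a.presupposition for a in alternatives]


def felicitous(alternatives, context):
    """The question is defined only if every presupposition is in the context."""
    return all(a.presupposition in context for a in alternatives)


# Hypothetical encoding of the alternatives in (94b).
alternatives_94a = [
    Partial("inferential evidence that Bill arrived",
            "Bill possibly arrived"),
    Partial("inferential evidence that Bill did not arrive",
            "Bill possibly did not arrive"),
]

mixed_evidence = {
    "inferential evidence that Bill arrived",
    "inferential evidence that Bill did not arrive",
}
one_sided_evidence = {"inferential evidence that Bill arrived"}
```

With `mixed_evidence` as the context, `felicitous(alternatives_94a, mixed_evidence)` holds; with `one_sided_evidence` it does not, mirroring the claim that (94a) presupposes evidence in both directions.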
In previous work (Matthewson 2008a, Littell et al. 2009), I assumed that
the mixed-evidence presuppositions which result when we conjoin the presuppositions
of all the propositions in the question set could derive the
reduced interrogative force of conjectural questions. The idea was that a
speaker who utters a question while presupposing that there is mixed or
even contradictory evidence about the true answer cannot be taken to be
requiring that the hearer provide the true answer to the question. That is,
the mixed presuppositions about evidence signal that the speaker does not
believe the question is easily answerable, and this lets the hearer off the hook
with respect to providing an answer.44
However, there are various problems with this analysis, as pointed out
by Littell (2009). One is that the evidence presuppositions are not always
contradictory. For example, a conjectural question such as ‘Who likes ice
cream?’ would presuppose for each contextually salient individual x that
there is inferential evidence that x likes ice cream. But it is perfectly possible
that everyone likes ice cream, and the evidence presuppositions in this case
do not rule out the possibility that the hearer knows the true answer. A
second problem is seemingly incorrect predictions about questions which
contain other evidentials, such as reportative or direct evidentials. Littell
argues that an analysis of conjectural questions which relies on conjoined
evidence presuppositions should predict reduced interrogative force for
any evidential question — yet cross-linguistically it is overwhelmingly only
inferential or conjectural evidentials which result in reduced interrogative
force. This is certainly true of St’át’imcets, as shown in the minimal pair in
(97).45
(97) a. stám’=as=k’a ts7a
what=3sbjn=infer here
‘I wonder what these are.’
b. *stám’=as=ku7 ts7a
what=3sbjn=report here
For these reasons, I instead adopt and extend an analysis proposed by
Littell (2009). Two assumptions are required. First, the evidence source
44 Rocci (2007) analyzes a construction in Italian with strikingly similar semantics and prag-
matics: the che-subjunctive construction. According to Rocci, che-subjunctives, which are
formed from questions, are interpreted as statements of doubt. He argues that they involve
epistemic modality and inferential evidentiality, and induce the following presuppositions:
(i) p is not in the Common Ground and ¬p is not in the Common Ground
(ii) There is no sign that either Speaker or Hearer knows whether p or ¬p
(iii) There is some set of facts E in CG, such that E is non-conclusive evidence in favor
of p
These are very similar to the effects of the St’át’imcets conjectural questions. However, Rocci
does not give a compositional analysis, perhaps partly because the che-subjunctives have no
overt evidentials or epistemic modals in the structure.
45 Cheyenne is an exception; reportatives in questions in Cheyenne allow non-interrogative
readings under certain circumstances (Murray to appear).
requirement of an evidential in a question can or must undergo ‘interrogative
flip’ (or ‘origo shift’; Garrett 2001, Faller 2002, 2006, Aikhenvald 2006, Tenny
& Speas 2004, Tenny 2006, Davis, Potts & Speas 2007, Murray to appear,
among others). Thus, a question containing an evidential expects that the
hearer, rather than the speaker, has the relevant type of evidence for the
answer. For example, (98) is not appropriate if directed to your mother, if she
is the one who always cooks dinner. However, it is acceptable when directed
to a third person, who might have heard from your mother what you are
going to eat.
(98) stám’=ku7 ku=cuz’=s-q’á7-lhkalh
what=report det=going.to=nom-eat-1pl.poss
‘What are we going to eat?’
The second assumption is that a speaker who uses an evidential which is
low on a hierarchy of evidence strength implicates that there is no available
evidence of a stronger type (Faller 2002, among others). This also seems to
be correct in St’át’imcets; the use of an inferential evidential, for example,
leads a hearer to infer that the speaker did not have reportative or direct
evidence.46
These two assumptions lead to the following result: a question containing
an evidential which is low on the scale of evidence strength will lead to an
implicature that the hearer does not have evidence of any stronger type. This
is illustrated in (99).
(99) a. man’c-em=há=k’a k=Hotze
smoke-mid=ynq=infer det=Hotze
‘I wonder if Hotze smokes.’
b. Alternatives introduced by (99a):
{that Hotze might smoke, that Hotze might not smoke}
c. Presupposition of (99a):
The hearer has inferential evidence both that Hotze smokes and
that Hotze does not smoke
46 Evidential hierarchies are a topic of some debate and there are many interesting questions
to be investigated (see Faller 2002 for an overview). It is also an interesting question how
evidence-type hierarchies interact with the variable interpretations of all evidentials in
St’át’imcets (Matthewson et al. 2007, Rullmann et al. 2008). Although all strengths are
possible for all evidentials in St’át’imcets, inferential k’a is more likely to be weaker (i.e., to
have a more restricted domain of worlds to quantify over), while the reportative ku7 and the
perceived-evidence =an’ are much more likely to give rise to stronger interpretations.
d. Implicature: The hearer does not have any stronger type of
evidence than inferential about the correct answer
According to Littell (2009), this analysis accounts for the reduced inter-
rogative force of conjectural questions. The idea is that inferential evidence
is a fairly weak type of evidence, and a speaker who asks a question while
implicating that the hearer only has inferential evidence about the true an-
swer is letting the hearer off the hook with respect to answering. This is
intended to account for (a) the judgments of St’át’imcets consultants that
conjectural questions do not require an answer, (b) the fact that conjectural
questions are infelicitous when the addressee is likely to know the answer (cf.
(87)), and (c) the fact that conjectural questions are translated as ‘I wonder’
or ‘maybe’-statements (although they do not literally have the semantics
of ‘wonder’). ‘I wonder’ is simply a typical method in English of raising a
question without demanding an answer.
However, this account does not seem to predict a complete absence of
interrogative force. After all, the inferential evidence the hearer is assumed
to possess is better than no evidence at all. In line with this, an English
question like ‘According to the weak evidence you have, could Hotze smoke?’
still functions pragmatically as an interrogative. I conclude, therefore, that
interrogative flip plus implicatures about the absence of stronger evidence are
not sufficient in and of themselves to completely let the hearer off the hook
with respect to answering. This is actually a welcome result, since questions
containing k’a in the indicative mood are sometimes translated by speakers
into English using ordinary questions (rather than as statements of doubt;
see footnote 40). However, conjectural questions containing the subjunctive
are never translated as ordinary questions. I therefore assume that while a
question containing an evidential is already somewhat ‘weakened’ in terms
of its interrogative force, the subjunctive performs a further weakening. The
task now is to see whether this falls out from the analysis of the subjunctive
proposed above.
Recall that in the context of a governing modal, the subjunctive adds the
presupposition that in at least one of the best worlds in the modal base, the
proposition is false. The best worlds here (as the modal is epistemic) are
those which conform to the propositions known to be true, and in which
things happen as normal. Since the evidential has undergone interrogative
flip, the epistemically accessible worlds must also be flipped to be the worlds
compatible with the hearer’s knowledge. The results are shown in (100).47
(100) a. cuz’=as=há=k’a ts7as s=Bill
going.to=3sbjn=ynq=infer come nom=Bill
‘I wonder if Bill is going to come.’
b. Alternatives introduced by (100a):
{that Bill is possibly going to come, that Bill is possibly not going
to come}
c. Presuppositions of (100a):
The hearer has inferential evidence both that Bill is going to
come and that Bill is not going to come; Bill doesn’t come in at
least one normal world compatible with the hearer’s knowledge,
and Bill comes in at least one normal world compatible with the
hearer’s knowledge
d. Implicature: The hearer does not have any stronger type of
evidence than inferential about the correct answer
As before, the implicature that the hearer does not have strong evidence
about the true answer, combined with the mixed-evidence effect of the
evidential presuppositions, will partially reduce the expectation that the
hearer is able to answer the question. In addition, thanks to the subjunctive,
the question now presupposes not only that the evidence about Bill’s possible
arrival is mixed, but also that there are worlds compatible with the hearer’s
knowledge in which Bill does come, and worlds compatible with the hearer’s
knowledge in which he does not come. In other words, the hearer does not
know whether he will come or not. The result is that a subjunctive conjectural
question has a significantly reduced expectation on the hearer to provide an
answer.48
The account just given, which incorporates the analysis of the St’át’imcets
subjunctive as weakening a modal proposition via domain restriction,
successfully accounts for the distributional and interpretive facts illustrated
in (84)–(85) above. The fact that the subjunctive requires a modal licenser
in a question follows from the analysis of the subjunctive as requiring a
governing modal. The fact that an evidential in a question always licenses
at least slightly reduced interrogative force, regardless of mood, falls out
from the fact that the evidential plays a part in reducing interrogative force.
However, the added contribution of the subjunctive accounts for the
preferred presence of the subjunctive in conjectural questions, as well as for
the fact that questions containing an evidential plus the subjunctive, in
contrast to indicative evidential questions, can only be interpreted with reduced
interrogative force.
In the final part of this section I extend the discussion to conjectural
questions which contain a future morpheme rather than an evidential. We
have already seen some examples of this ((17b)–(18b) above). In contrast to the
evidential k’a, the future modal obligatorily requires the subjunctive mood if
it is to be interpreted as a statement of doubt. This is shown in (101)–(102),
where the (a) examples are only interpretable as ordinary questions which
expect an answer.
(101) a. t’íq=ha=kelh k=Bill
arrive=ynq=fut det=Bill
‘Is Bill going to come?’ fut + indic
b. t’iq=as=há=kelh k=Bill
arrive=3sbjn=ynq=fut det=Bill
‘I wonder if Bill will come.’ fut + sbjn
47 An anonymous reviewer raises a potentially significant issue with the choice function
required for these cases. With the deontic and imperative cases discussed above, the choice
function had intuitive content (e.g., the ‘very best way to achieve some end’), but here the role
of the subjunctive is purely to make sure there are some ‘best worlds’ where the prejacent is
false. It is thus not clear which proper subset of the best worlds the function picks out.
48 As noted above, conjectural questions also imply that the speaker does not know the answer.
I assume that this follows, by Gricean reasoning, from the fact that the speaker uttered a
question, rather than having simply asserted the true answer. However, there is a bit more to
be said here, since plain questions in St’át’imcets allow a ‘display question’ use, in which a
teacher can ask (i):
(i) k’win ku=án’was múta7 án’was
how.many det=two and two
‘What is two plus two?’
As an anonymous reviewer points out, this display use should technically remain even when
the subjunctive is added. However, consultants judge the subjunctive version of (i) to no
longer be a teacher’s question, but a student’s reply:
(ii) k’wín=as=k’a ku=án’was múta7 án’was
how.many=3sbjn=infer det=two and two
‘I don’t know how much two plus two is.’
Perhaps conjectural questions like (ii) simply do not make good questions for a teacher to
ask because they encode addressee ignorance.
(102) a. inwat=wít=kelh
say.what=3pl=fut
‘What will they say?’ fut + indic
b. inwat=wít=as=kelh
say.what=3pl=3sbjn=fut
‘I wonder what they will say.’ fut + sbjn
The contrast between the evidential and the future with respect to whether
the subjunctive is required to create a conjectural question is striking. So
far, I have argued that the evidential k’a contributes to reduced interrogative
force by means of an implicature that the hearer has no better than inferential
evidence for the true answer, and that the subjunctive contributes to further
reduced interrogative force by presupposing that it is compatible with the
hearer’s knowledge state that each possible answer is false. Now unlike k’a,
the future modal kelh has not been analyzed as an epistemic modal, and it
does not introduce any evidence presuppositions. The denotation for kelh is
given in (103).
(103) $[\![\textit{kelh}(h)(g)]\!]^{c,w,t}$ is only defined if $h$ is a circumstantial modal base
and $g$ is a stereotypical ordering source.
If defined, $[\![\textit{kelh}(h)(g)]\!]^{c,w,t} = \lambda q_{\langle s,\langle i,t\rangle\rangle}.\,\forall w' \in f_c(\max_{g(w)}(\cap h(w,t)))\,[\exists t'\,[t < t' \wedge q(w')(t') = 1]]$
(adapted from Rullmann et al. 2008)49
49 I have altered Rullmann et al.’s formula to incorporate the ordering source and to make the
format parallel to that of other formulas above. The modal base in (103) is a function from
world-time pairs to sets of propositions.
Applying this analysis of kelh to questions containing a subjunctive gives
(104).
(104) a. nká7=as=kelh lh=cúz’=as nas k=Gloria
where=3sbjn=fut comp=going.to=3sbjn go det=Gloria
‘I wonder where Gloria will go.’
b. {that Gloria will go home, that Gloria will go to her mother’s
house, . . . }
c. Presuppositions of (104a):
The future claim is made on the basis of the facts; Gloria won’t
go home in at least one stereotypical world compatible with the
facts, Gloria will not go to her mother’s house in at least one
stereotypical world compatible with the facts, . . .
There are no implicatures about evidence types this time, but interestingly,
we still predict reduced interrogative force. And this time, the contribution
of the subjunctive is absolutely critical to deriving the effect. Due to the
subjunctive, the question as a whole presupposes for each contextually
salient place that Gloria might go, that there is at least one stereotypical
world compatible with the facts in which she doesn’t go there. This means
that the facts underdetermine where she might go — and thus, that the
addressee may not know where she will go. Given that the subjunctive is
crucial in deriving the reduced interrogative force, we correctly predict that
the subjunctive is obligatory in conjectural questions like (102).
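The reasoning about (104) can be illustrated with a small executable sketch. It is only a toy model under stated assumptions: the world labels, the two destinations, and the helper names are invented for illustration, and the temporal and universal-force components of kelh in (103) are left out. It checks that when the subjunctive presupposition holds for every alternative answer, no answer is true in all of the best worlds, so the facts leave the question open.

```python
# Hypothetical toy model: each best world (stereotypical and compatible
# with the facts) records where Gloria goes. All labels are invented.
BEST_WORLDS = {"w1": "home", "w2": "mother's house", "w3": "home"}
ALTERNATIVES = ["home", "mother's house"]


def answer_prop(place):
    """Proposition 'Gloria will go to place', as a set of best worlds."""
    return {w for w, dest in BEST_WORLDS.items() if dest == place}


def subjunctive_ok(place):
    """Subjunctive presupposition: the answer fails in at least one best world."""
    return any(w not in answer_prop(place) for w in BEST_WORLDS)


def facts_settle_answer():
    """True if some alternative holds in every best world."""
    return any(answer_prop(p) == set(BEST_WORLDS) for p in ALTERNATIVES)


presuppositions_met = all(subjunctive_ok(p) for p in ALTERNATIVES)
print(presuppositions_met)    # True
print(facts_settle_answer())  # False
```

In this toy setting the two results are linked: once every alternative is false somewhere among the best worlds, none of them can be settled by the facts, which is the sense in which the addressee may not know the answer.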
7.3 Ignorance free relatives
Ignorance free relatives in St’át’imcets are formed by the combination of a
wh-word, the subjunctive, and the inferential evidential k’a. Some examples
are repeated here.50
(105) a. qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7,
leave=prt again go.downhill deic go then deic
nílh=k’a s=npzán-as
foc=infer nom=meet(dir)-3erg
k’a=lh=swát=as=k’a káti7 ku=npzán-as
infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg
‘So he set off downhill again, went down, and then he met who-
ever he met.’ (van Eijk & Williams 1981: 66, cited in Davis 2009)
b. o, púpen’=lhkan [ta=stam’=as=á=k’a]
oh find=1sg.indic [det=what=3sbjn=exis=infer]
‘Oh, I’ve found something or other.’
(Unpublished story by “Bill” Edwards, cited in Davis 2009)
50 Thanks to Henry Davis for helpful discussions of free relatives in St’át’imcets.
There is a large literature on free relatives, concentrating mainly on
English (although see Dayal 1997 for discussion of Hindi and Davis 2009 for
discussion of St’át’imcets). Here I adopt von Fintel’s (2000) analysis; as far as I
know, nothing crucial hinges on the differences between von Fintel’s analysis
and those of, for example, Jacobson (1995) or Dayal (1997). I will argue
that the St’át’imcets ignorance free relatives are compatible with von Fintel’s
proposals, and that their interpretation relies on the independently-attested
semantics of the subjunctive and the evidential.
According to von Fintel, both ignorance and indifference free relatives
presuppose that there is variation among the worlds in the modal base with
respect to the identity of the referent. The free relative denotes a definite
description, and the sentence as a whole asserts that the definite description
satisfies the relevant property.
(106) $\textit{whatever}(w)(F)(P)(Q)$
a. presupposes: $\forall w' \in \min_w[F \cap (\lambda w'.\,\iota x.P(w')(x) \neq \iota x.P(w)(x))] : Q(w')(\iota x.P(w')(x)) = Q(w)(\iota x.P(w')(x))$
b. asserts: $Q(w)(\iota x.P(w)(x))$ (von Fintel 2000: 34)
With ignorance free relatives, the modal base F is the epistemic alterna-
tives of the speaker.51 Consider (107), for example.
(107) There’s a lot of garlic in whatever (it is that) Arlo is cooking.
(von Fintel 2000: 27)
(107) presupposes that in all the speaker’s epistemically accessible worlds
which are minimally different from the actual world and in which Arlo is
cooking something different from what he is actually cooking, there is the
same amount of garlic in what he is cooking. As the min-operator introduces
an existential presupposition, (107) presupposes that there are epistemically
accessible worlds in which Arlo is cooking something different from what
he is actually cooking. This amounts to a presupposition that the speaker is
ignorant about the identity of what Arlo is cooking. (107) then asserts that
the unique thing which Arlo is cooking has a lot of garlic in it.
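The way (106) applies to (107) can be sketched in a few lines of code. This is a toy model only: the world names, dishes, and garlic facts are invented, and the min-operator is simplified to keep every epistemic alternative in which the dish differs, since the sketch has no similarity ordering over worlds. The uniformity check follows (106a) as printed above.

```python
# Hypothetical toy model; world names, dishes, and garlic facts are invented.
ACTUAL = "w0"
EPISTEMIC_ALTERNATIVES = {"w0", "w1", "w2"}  # F: worlds the speaker cannot rule out
DISH = {"w0": "stew", "w1": "curry", "w2": "stew"}  # P: what Arlo cooks in each world
GARLICKY_DISHES = {"stew", "curry"}  # assumed facts behind Q


def garlicky(world, dish):
    """Q(world)(dish); constant across worlds here to keep the sketch small."""
    return dish in GARLICKY_DISHES


def differing_worlds(actual, alternatives):
    """Worlds in F where Arlo cooks something other than the actual dish.

    Simplification: the min-operator of (106a) is treated as keeping all
    such worlds, because the toy model has no similarity ordering.
    """
    return {w for w in alternatives if DISH[w] != DISH[actual]}


def ignorance_presupposed(actual, alternatives):
    """Existential presupposition: some alternative differs in the dish."""
    return len(differing_worlds(actual, alternatives)) > 0


def uniformity_presupposed(actual, alternatives):
    """(106a): Q gives the same verdict in w' and in w for the w'-dish."""
    return all(
        garlicky(w, DISH[w]) == garlicky(actual, DISH[w])
        for w in differing_worlds(actual, alternatives)
    )


def asserted(actual):
    """(106b): the actual dish is garlicky in the actual world."""
    return garlicky(actual, DISH[actual])


print(ignorance_presupposed(ACTUAL, EPISTEMIC_ALTERNATIVES))   # True
print(uniformity_presupposed(ACTUAL, EPISTEMIC_ALTERNATIVES))  # True
print(asserted(ACTUAL))                                        # True
```

If the speaker's alternatives are narrowed to worlds where Arlo cooks stew, `ignorance_presupposed` returns False, which corresponds to the presupposition failure expected when the speaker knows what is being cooked.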
Turning to St’át’imcets, we see that von Fintel’s semantics captures the
required meanings accurately. (105a) presupposes that the speaker does not
know who ‘he’ (the man being talked about) met, and asserts that he met
whoever he met. Moreover, it seems that we can account for the presence of
the subjunctive in free relatives, and also for the presence of the inferential
evidential. In particular, I would like to suggest that the presupposition of
speaker ignorance about the denotation of the free relative actually derives
from the evidential k’a and the subjunctive.
51 With indifference free relatives, the modal base includes counterfactual alternatives.
The basic idea is that an ignorance free relative is formed from a conjec-
tural question (see Davis 2009 for this insight, although Davis does not word
it in this way). The free relative in (105a), for example, is formed from the
conjectural question in (108).
(108) swát=as=k’a káti7 ku=npzán-as
who=3sbjn=infer deic det=meet(dir)-3erg
‘I wonder who he met.’
Following the analysis of conjectural questions given in subsection 7.2, (108)
denotes the set of propositions of the form ‘he met x’. The evidential in
(108) would normally undergo interrogative flip, giving rise to the inference
that the hearer is not in a position to answer the question of who he met.
When (108) is embedded in a non-matrix environment as in (105a), however, I
assume that interrogative flip does not take place. The free relative based on
(108) will therefore carry a conjoined presupposition that the speaker has
inferential evidence for each alternative, and an implicature that the speaker
has no stronger evidence about who he met. And due to the subjunctive,
it will presuppose that for each alternative, there is at least one best world
in the modal base in which that alternative is false. Thus, the free relative
formed from (108) will presuppose that there is mixed evidence about who he
met, and that for each person x, it’s compatible with the speaker’s knowledge
that he did not meet x. This derives the desired ‘speaker ignorance’ presup-
position. Moreover, we can regard the subjunctive as an overt spell-out of
the existential presupposition of the min-operator, namely that there are
epistemically accessible worlds in which the person he met is not who he met
in the actual world.
A final advantage of this approach is that we correctly capture the fact
that the modal base contains epistemic alternatives, as k’a lexically encodes
an epistemic conversational background. This accounts for the fact that only
ignorance free relatives, and not indifference free relatives, contain k’a in
St’át’imcets (Davis 2009).52
52 Free relatives in St’át’imcets are far from solved. For example, Davis (2009) points out
a problem with free relatives which surface as DPs, as in (105b) above. Davis shows that
syntactically, this wh-word acts like the head noun of a relative clause. This poses a challenge
for the claim that (105b) is formed from a conjectural question. Moreover, if the wh-word is
functioning as a head noun in (105b), the evidential k’a should not be able to attach to it, as
k’a attaches only to predicates. This is a peculiarity of k’a; Davis shows that other
second-position evidentials, such as reportative ku7 or perceived-evidence =an’, are ungrammatical
in free relatives. Further research is required.
7.4 ‘Pretend’
There are two patterns to account for with the ‘pretend’ cases, depending on
the dialect. In Upper St’át’imcets, the subjunctive plus the normative modal
ka frequently renders a ‘pretend to be ...’ interpretation. In Whitley et al. no
date, a native-speaker-produced St’át’imcets teaching manual, the standard
construction when the teacher is asking the students to pretend something
is that in (109).
(109) a. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em
owl=2sg.sbjn=deon fly deic and animal.noise-mid
‘Pretend to be an owl: fly around and hoot.’ (Davis 2006: chapter
24)
b. snu=hás=ka ku=skícza7
2sg.emph=3sbjn=deon det=mother
‘Pretend to be the mother.’ (Whitley et al. no date)
In Lower St’át’imcets, however, examples like the ones in (109) are rejected
in ‘pretend’ contexts. Lower St’át’imcets uses either an emphatic pronoun
in a cleft, as in (110a), or the adhortative particle malh, as in (110b). In each
case, the subjunctive is present, but ka is absent.
(110) a. nu=hás ku=skalúla7: sáq’w=kacw knáti7
2sg.emph=3sbjn det=owl fly=2sg.indic deic
‘Pretend to be an owl.’
b. skalúl7=acw=malh: sáq’w=kacw knáti7
owl=2sg.sbjn=adhort fly=2sg.indic deic
‘Pretend to be an owl: fly around.’
In each of the dialectal variants, the apparent ‘pretend’ construction
seems to reduce to another usage, rather than really meaning ‘pretend’.
The examples in (109) are merely instances of the subjunctive adding to a
normative modal assertion. (109a) thus really means something like ‘I wish
you were an owl’, and (109b) means ‘I wish you were the mother.’ In (110a),
the subjunctive adds to a plain assertion to create a wish, something which is
possible with clefts; cf. (5) above. As for (110b), the consultant spontaneously
translates this into English as ‘You may as well be an owl’. The presence
of adhortative malh here is a matter for future research; see comments in
Section 8 below.
Support for the idea that (109) and (110) are not really ‘pretend’ construc-
tions comes from the fact that exactly parallel structures are used when the
wish is not that someone pretend to be something, but rather is a wish which
has a chance of coming true. This is shown in (111). While the consultant
accepts a ‘pretend’ translation for the sentences in (111), she spontaneously
translates them into English using simply ‘you be . . . ’. She judges that the
St’át’imcets sentences do not really mean ‘pretend’.
(111) a. nu=hás ku=kúkwpi7
2sg.emph=3sbjn det=chief
‘Pretend to be the chief.’ [accepted]
‘You be the chief.’ [spontaneously given]
b. nu=hás ku=kúkw
2sg.emph=3sbjn det=cook
‘Pretend to cook.’ [accepted]
‘You be the cook.’ [spontaneously given]
7.5 Why St’át’imcets is not like Romance
In this final sub-section I return to a major cross-linguistic difference between
the St’át’imcets subjunctive and more familiar, Indo-European subjunctives,
namely that in St’át’imcets the subjunctive is never selected by a matrix
predicate, and in fact is ungrammatical under all attitude verbs (as shown in
(38) above).
It turns out that this falls out from the current analysis. The St’át’imcets
subjunctive is parasitic on a modal, and introduces the presupposition that
in at least one of the best worlds in the modal base according to the ordering
source, the embedded proposition is false. This presupposition is incompati-
ble with the semantics of attitude verbs, which are standardly analyzed as
introducing universal quantification over a set of worlds. This is illustrated
in (112) for English believe.
(112) $[\![\textit{believe}]\!]^{w,g} = \lambda p_{\langle s,t\rangle}.\,\lambda x.\,\forall w'$ compatible with what $x$ believes in $w$: $p(w') = 1$
There is no reason to assume that attitude verbs like ‘believe’ have a different
semantics in St’át’imcets than in English. On the contrary, the St’át’imcets
verb tsutánwas ‘think, believe’ must involve universal quantification over
belief-worlds, without the possibility of domain restriction (in other words,
there is no choice function or second ordering source). Thus, (113), just like
its English gloss, requires that in all Laura’s belief-worlds, John has left. It
cannot mean that Laura’s beliefs allow, but do not require, that John has left.
(113) tsut-ánwas k=Laura kw=s=qwatsáts=s k=John
say-inside det=Laura det=nom=leave=3poss det=John
‘Laura thinks that John left.’
Given this, adding the subjunctive under the verb ‘believe’ in St’át’imcets
leads to the following contradictory result.
(114) *tsut-ánwas k=Laura kw=s=qwatsáts=as k=John
say-inside det=Laura det=nom=leave=3sbjn det=John
‘Laura thinks that John left.’
$[\![(114)]\!]^{w}$ is only defined if $\exists w'$ compatible with Laura’s beliefs in $w$: John didn’t leave in $w'$
If defined, $[\![(114)]\!]^{w} = 1$ iff $\forall w'$ compatible with Laura’s beliefs in $w$: John left in $w'$
The presupposition of the subjunctive contradicts the assertion. This explains
why the subjunctive is not used under verbs like ‘believe’ in St’át’imcets,
unlike in Romance.
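The clash in (114) can be checked mechanically in a toy setting. The sketch below is illustrative only and rests on assumptions not in the text: three invented world labels, propositions modelled as sets of worlds, and hypothetical helper names. It enumerates every non-empty belief state and confirms that none makes (114) both defined (the subjunctive presupposition) and true (the universal assertion of ‘believe’ in (112)).

```python
from itertools import combinations

# Hypothetical toy model: a world is a label, and a proposition is the
# set of worlds in which it is true.
WORLDS = frozenset({"w1", "w2", "w3"})
JOHN_LEFT = frozenset({"w1", "w2"})  # assumed: John left in w1 and w2 only


def believe_asserts(belief_worlds, prop):
    """Assertion of 'believe' as in (112): prop holds in every belief world."""
    return all(w in prop for w in belief_worlds)


def subjunctive_presupposes(belief_worlds, prop):
    """Presupposition of the subjunctive: prop fails in some belief world."""
    return any(w not in prop for w in belief_worlds)


def nonempty_subsets(worlds):
    """Every non-empty candidate belief state over the toy worlds."""
    items = sorted(worlds)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def satisfiable(prop):
    """Is there a belief state where (114) is both defined and true?"""
    return any(
        subjunctive_presupposes(b, prop) and believe_asserts(b, prop)
        for b in nonempty_subsets(WORLDS)
    )


print(satisfiable(JOHN_LEFT))  # False
```

The result does not depend on the particular proposition chosen: an existential requirement that the proposition fail somewhere and a universal requirement that it hold everywhere cannot both be met over the same set of worlds.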
We need to separately discuss the absence of subjunctive under desire
verbs in St’át’imcets. An example was given in (38e), repeated here.53
(115) xát’-min’-as k=Laura kw=s=t’iq=Ø k=John
want-red-3erg det=Laura det=nom=arrive=3indic det=John
Desire verbs are often treated as involving comparison between alternative
worlds (e.g., Stalnaker 1984, Heim 1992 and much subsequent work). The
intuition is that ‘John wants you to leave means that John thinks that if you
leave he will be in a more desirable world than if you don’t leave’ (Heim 1992:
193). Here I adopt Portner’s (1997) analysis of desire verbs, and in particular
we will see that the St’át’imcets verb xát’min’ is better analyzed as similar to
English hope (which according to Portner is similar to believe, and therefore
is not intrinsically comparative) than to English want.
53 Thanks to an anonymous reviewer for discussion of this issue.
Portner analyzes hope in terms of a buletic accessibility relation $\mathrm{Bul}_{\alpha}(s, b)$.
For any situation s and belief situation b of an agent α, $\mathrm{Bul}_{\alpha}(s, b)$ is the set
of buletic alternatives for α in s, i.e., ‘the worlds in which the most of α’s
plans in s (relative to his or her beliefs in b) are carried out’ (Portner 1997:
178). The sentence in (116) receives the interpretation shown: it is true just in
case in all of James’s buletic alternatives, Joan arrives in Richmond soon.
(116) James hopes that Joan arrives in Richmond soon.
$\{s : \mathrm{Bul}_{\mathrm{James}}(s, b) \subseteq [\![\text{Joan arrives in Richmond soon}]\!]^{s}\}$
(Portner 1997: 188)
Portner’s analysis of hope differs from that of want, and is parallel to that
of believe, in crucial respects (which explain the different embedding possi-
bilities for hope/believe vs. want). In particular, while hope and believe are
defined directly in terms of (doxastic or buletic) alternatives, want is defined
in terms of the agent’s plans. Portner argues that the difference between
hope and want is ‘an idiosyncratic lexical one’ (Portner 1997: 189). If this is
correct, it would not be unexpected that a language could contain only the
hope-type of desire predicate.
If we apply Portner’s analysis of hope to St’át’imcets xát’min’, and attempt
to use the subjunctive in the embedded clause, we get the result in (117).
(117) *xát’-min’-as k=Laura kw=s=t’íq=as k=John
want-red-3erg det=Laura det=nom=arrive=3sbjn det=John
$[\![(117)]\!]^{s}$ is only defined if $\exists s \in \mathrm{Bul}_{\mathrm{Laura}}(s, b)$: John does not come in $s$
If defined, $[\![(117)]\!]^{s} = 1$ iff $\{s : \mathrm{Bul}_{\mathrm{Laura}}(s, b) \subseteq [\![\text{John comes}]\!]^{s}\}$
(117) is defined only if there is at least one situation in Laura’s buletic alter-
natives in which John does not come, but it asserts that in all Laura’s buletic
alternatives, John comes. The contradiction between the presupposition and
the assertion leads to the unacceptability of the sentence.
The idea that St’át’imcets xát’min’ is parallel to English hope or believe
rather than to English want leads to the following cross-linguistic compari-
son. While Indo-European has two kinds of attitude verbs — those involving
universal quantification over alternative worlds, and those which are intrin-
sically comparative — St’át’imcets has only the former kind. This explains
why St’át’imcets lacks subjunctives under attitude verbs, and even allows us
to draw the broader generalization that St’át’imcets only allows universal
quantification over worlds. This language lacks both true possibility modals
and comparative subjunctive-embedding predicates.54
8 Conclusions and questions for future research
The goal of this paper was to extend the formal cross-linguistic study of
modality to the related domain of mood. Prior work on St’át’imcets has
proposed that languages vary in whether their modals encode quantifica-
tional force (as in English), or conversational background (as in St’át’imcets)
(Matthewson et al. 2007, Rullmann et al. 2008, Davis et al. 2009). Here, I have
argued that languages vary in their mood systems along the same dimension,
at least functionally. While some languages use moods to encode distinctions
of conversational background (buletic, deontic, etc.), St’át’imcets uses mood
to functionally achieve a restriction on modal quantificational force. (Of
course technically, both modals and moods in St’át’imcets restrict conver-
sational backgrounds: the modal force is always universal.) If this view is
correct, then each language-type draws on its moods and its modals together
to allow the full range of specifications. In other words, what modals don’t
encode, moods do. The simplified typological table is repeated here.
                 lexically encode    lexically encode
                 quant. force        conv. background
Indo-European    modals              moods
St’át’imcets     moods               modals
Table 5 Modal and mood systems
54 Thanks to an anonymous reviewer for discussion of this point.
The analysis presented here raises some questions for future research.
One outstanding issue is the status of subjunctives with no overt licenser at
all, as in (5)–(6). As noted earlier, these appear to be productive only in clefts.
It is not immediately obvious that a cleft contains a modal operator which
would license the subjunctive, so further investigation is required (although
see fn. 22).
A second interesting puzzle relates to subjunctive imperatives (see sub-
section 7.1). These seem to strongly prefer the presence of the adhortative
particle malh, which is normally optional in imperatives. Perhaps malh (which
has not previously been analyzed) is a modal, and perhaps its obligatoriness
reflects the licensing requirement of the subjunctive. But what consequence
would this have for the analysis provided above, which assumes that even
imperatives with no adhortative particle contain a concealed deontic modal?
This question cannot be answered without a real investigation of malh,
something which goes beyond the bounds of the current paper.
An even trickier element is the particle t’u7. t’u7 is the culprit in the
two uses of the subjunctive I have declined to analyze here, the ‘might as
well’ cases and the indifference free relatives. Like malh, t’u7 has not yet
been formally analyzed, but for t’u7 there are not even any clear descriptive
generalizations about its usage. It is often translated as ‘just’ or ‘still’, but also
occurs where there is no obvious English translation, or even any detectable
semantic contribution. t’u7 frequently appears with strong quantifiers, as in
(118a), is almost obligatory if one wants to express ‘only’, as in (118b), and is
also the St’át’imcets way to express ‘but’, as in (118c) (although here, unlike
in its other uses, it is not a second-position enclitic, and this may therefore
be a case of homophony).
(118) a. tákem=t’u7 swat áolsvm l=ti=tsítcw=a
all=prt who sick in=det=house=exis
‘Everyone in the house was sick.’ (Matthewson 2005: 311)
b. tsúkw=t’u7 snilh ti=tsícw=a aolsvm-áolhcw
finish=prt 3sg.emph det=get.there=exis sick-house
‘It was only him who went to the hospital.’ (Matthewson 2005:
324)
c. plan aylh láku7 wa7 cw7it i=tsetsítcw=a, t’u7
already then deic impf many det.pl=houses=exis but
pináni7 cw7aoz láti7 ku=wá7 tsitcw
temp.deic neg deic det=impf house
‘Now there are lots of houses there, but then there were no
houses.’
As noted above, t’u7 is present in the ‘might as well’ uses of the subjunc-
tive, and in indifference free relatives. Examples are repeated here.
(119) a. wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.indic=prt deic now det=evening
‘You are staying here for the night.’
b. wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp
be=2sg.sbjn=prt deic now det=evening
‘You may as well stay here for the night.’
(120) [stám’=as=t’u7 káti7 i=wá7 ka-k’ac-s-twítas-a
[what=3sbjn=prt deic det.pl=impf circ-dry-caus-3pl.erg-circ
i=n-slalíl’tem=a] wa7 ts’áqw-an’-em
det.pl=1sg.poss-parents=exis] impf eat-dir-1pl.erg
lh=as sútik
comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’
(Matthewson 2005: 141, cited in Davis 2009)
Given the analysis above, we expect there to be a modal, or at least a
modal base and an ordering source, present in any structure where the
subjunctive is licensed. The interpretation of subjunctive + t’u7 in (119b) is
plausibly modal — the consultants are remarkably consistent with the ‘might
as well’ translation. There is also a certain similarity between the ‘might
as well’ construction and the Sufficiency Modal Construction (Krasikova &
Zchechev 2005, von Fintel & Iatridou 2008), illustrated in (121).
(121) To get good cheese, you only have to go to the North End!
The crucial elements of the Sufficiency Modal Construction are (a) a
necessity modal and (b) an exclusive operator such as ‘only’.55 The possible
connection between (119) and (121) may be fruitful to investigate in future
work.56
55 For von Fintel and Iatridou, the ‘only’ is decomposed into ‘neg . . . except’ (and shows up
overtly as this in some languages).
56 See also Mitchell 2003 on ‘might as well’ in English.
As for indifference free relatives as in (120), these also very plausibly con-
tain a covert modal, presumably a necessity one. The important question will
be whether the subjunctive can be analyzed as a weakener in the indifference
free relatives. Ideally, the future analysis of (119)–(120) will also elucidate
the semantic connection between the two t’u7-subjunctives, both of which
somehow express the notion of ‘indifference’ (although perhaps in different
senses of the word). (119b), for example, conveys that you can stay here for
the night or not, I don’t really care.
In spite of these outstanding questions, I believe that the empirical cover-
age of the analysis presented here is encouraging. Out of the nine meaningful
uses of the St’át’imcets subjunctive, we set aside two which rely on the poorly-
understood particle t’u7, but have managed to unify the remaining seven.
The analysis accounts for such seemingly disparate effects as the weakening
of imperatives, the reduction in interrogative force of questions, and the
non-appearance of the subjunctive under any attitude verb. The analysis, if
correct, supports the modal approach to mood advocated by Portner (1997),
and suggests that languages have a certain amount of freedom in how they
divide up the various functional tasks required of moods and modals.
Finally, the research reported on here opens up broader questions about
the nature of mood cross-linguistically, for example about the relation be-
tween subjunctive and irrealis. In Section 2, I showed that the St’át’imcets
subjunctive patterns morpho-syntactically, as well as in some of its semantic
properties, like a subjunctive rather than an irrealis. However, we also saw
that the St’át’imcets subjunctive differs semantically from Indo-European
subjunctives. I argued above (see fn. 9) that the use of the term ‘subjunctive’
was justified, even in the face of such non-trivial cross-linguistic variation.
However, there is much more work to be done on the formal semantics of
mood cross-linguistically. Once a wider range of systems are investigated
in depth, we may find that the traditional terminology does not correlate
with the cross-linguistically interesting divisions. Topics for future inquiry
include whether there is a minimal semantic change which would turn a
subjunctive morpheme into an irrealis one, or vice versa, and in general what
the semantic building blocks are from which moods are composed.
References
Aikhenvald, Alexandra. 2006. Evidentiality. New York: Oxford University
Press.
Baker, Mark & Lisa Travis. 1997. Mood as verbal definiteness in a
“tenseless” language. Natural Language Semantics 5(3). 213–269.
doi:10.1023/A:1008262802401.
Beghelli, Filippo. 1998. Mood and the interpretation of indefinites. The
Linguistic Review 15(2-3). 277–300. doi:10.1515/tlir.1998.15.2-3.277.
Bolinger, Dwight. 1968. Postposed main phrases: an English rule for the
Romance subjunctive. Canadian Journal of Linguistics 14. 3–33.
Caponigro, Ivano & Jon Sprouse. 2007. Rhetorical questions as questions. In
Proceedings of Sinn und Bedeutung 11, 121–133. http://idiom.ucsd.edu/
~ivano/Papers/2007_Rhetorical-Qs_SuB.pdf.
Condoravdi, Cleo. 2002. Temporal interpretation of modals: Modals for the
present and the past. In David Beaver, Stefan Kaufmann, Brady Clark &
Luis Casillas (eds.), Stanford Papers on Semantics, vol. 7, 59–88. Stanford:
CSLI Publications. http://semanticsarchive.net/Archive/2JmZTIwO/.
Davis, Christopher, Christopher Potts & Margaret Speas. 2007. The pragmatic
values of evidential sentences. In Masayuki Gibson & Tova Friedman (eds.),
Proceedings of the 17th Conference on Semantics and Linguistic Theory,
71–88. Ithaca, NY: CLC Publications. doi:1813/11294.
Davis, Henry. 2000. Remarks on Proto-Salish subject inflection. International
Journal of American Linguistics 66(4). 499–520. doi:10.1086/466439.
Davis, Henry. 2006. A grammar of Upper St’át’imcets. Ms., University of
British Columbia.
Davis, Henry. 2009. Free relatives in St’át’imcets (Lillooet Salish). Ms., Univer-
sity of British Columbia.
Davis, Henry, Lisa Matthewson & Hotze Rullmann. 2009. ‘Out of control’
marking as circumstantial modality in St’át’imcets. In Lotte Hogeweg,
Helen de Hoop & Andrey Malchukov (eds.), Cross-linguistic semantics of
tense, aspect and modality, 205–244. Amsterdam: John Benjamins.
http://www.linguistics.ubc.ca/sites/default/files/TamTam_final_11-08-08.pdf.
Dayal, Veneeta. 1997. Free relatives and ever: Identity and free choice read-
ings. In Proceedings of SALT VII, 99–116. http://www.rci.rutgers.edu/
~dayal/ever.pdf.
van Eijk, Jan. 1997. The Lillooet language: Phonology, morphology, syntax.
Vancouver, BC: UBC Press.
van Eijk, Jan & Lorna Williams. 1981. Lillooet legends and stories. Mt. Currie,
BC: Ts’zil Publishing House.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco
Quechua: Stanford dissertation.
Faller, Martina. 2006. Evidentiality and epistemic modality at the se-
mantics/pragmatics interface. http://www.eecs.umich.edu/~rthomaso/
lpw06/fallerpaper.pdf.
Farkas, Donka. 1992. On the semantics of subjunctive complements. In Paul
Hirschbühler & Konrad Koerner (eds.), Romance languages and modern
linguistic theory: Papers from the 20th linguistic symposium on Romance
languages, 69–104. Amsterdam and Philadelphia: Benjamins.
Farkas, Donka. 2003. Assertion, belief and mood choice. Paper presented at
the Workshop on Conditional and Unconditional Modality, ESSLLI, Vienna.
http://people.ucsc.edu/~farkas/papers/mood.pdf.
von Fintel, Kai. 2000. Whatever. In Proceedings of SALT X, 27–40. http:
//web.mit.edu/fintel/www/whatever.pdf.
von Fintel, Kai & Anthony Gillies. 2010. Must . . . stay . . . strong! Natural
Language Semantics. doi:10.1007/s11050-010-9058-2.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics lecture notes. Ms.,
MIT. http://mit.edu/fintel/IntensionalSemantics.pdf.
von Fintel, Kai & Sabine Iatridou. 2008. How to say ought in foreign: The
composition of weak necessity modals. In Jacqueline Guéron & Jacqueline
Lecarme (eds.), Time and modality, 115–141. Dordrecht: Springer. http:
//mit.edu/fintel/fintel-iatridou-2006-ought.pdf.
Garrett, Edward. 2001. Evidentiality and assertion in Tibetan. Los Angeles,
CA: UCLA dissertation.
Gauker, Christopher. 1998. What is a context of utterance? Philosophical
Studies 91(2). 149–172. doi:10.1023/A:1004247202476.
Giannakidou, Anastasia. 1997. The landscape of polarity items. Groningen:
University of Groningen dissertation.
Giannakidou, Anastasia. 1998. Polarity sensitivity as (non)veridical depen-
dency. Amsterdam and Philadelphia: John Benjamins.
Giannakidou, Anastasia. 2009. The dependency of the subjunctive re-
visited: Temporal semantics and polarity. Lingua 119(12). 1883–1908.
doi:10.1016/j.lingua.2008.11.007.
Giorgi, Alessandra & Fabio Pianesi. 1997. Tense and aspect: From semantics
to morpho-syntax. Oxford: Oxford University Press.
Guerzoni, Elena. 2003. Why ‘even’ ask? On the pragmatics of questions and
the semantics of answers: MIT dissertation. http://hdl.handle.net/1721.1/17646.
Hamblin, C. L. 1973. Questions in Montague English. Foundations of Language
10(1). 45–53. http://www.jstor.org/stable/25000703.
Han, Chung-hye. 1997. Deontic modality of imperatives. Language and
Information 1. 107–136.
Han, Chung-hye. 1999. Deontic modality, lexical aspect and the semantics
of imperatives. In Linguistics in the morning calm 4, Seoul: Hanshin
Publications. http://www.sfu.ca/~chunghye/papers/morningcalm.pdf.
Han, Chung-hye. 2002. Interpreting interrogatives as rhetorical questions.
Lingua 112(3). 201–229. doi:10.1016/S0024-3841(01)00044-4.
Haverkate, Henk. 2002. The syntax, semantics and pragmatics of Spanish
mood. Amsterdam and Philadelphia: John Benjamins.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude
verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Hooper, Joan B. 1975. On assertive predicates. In John Kimball (ed.), Syntax
and semantics 4, 91–124. New York: Academic Press.
Jacobs, Peter. 1992. Subordinate clauses in Squamish: A Coast Salish language.
MA thesis, University of Oregon.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives.
In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara Partee (eds.),
Quantification in natural language, 451–486. Dordrecht: Kluwer.
James, Frances. 1986. Semantics of the English subjunctive. Vancouver, BC:
UBC Press.
Klein, Flora. 1975. Pragmatic constraints in distribution: the Spanish subjunc-
tive. In Papers from the 11th CLS, 353–365.
Krasikova, Sveta & Ventsislav Zhechev. 2005. Scalar uses of only in
conditionals. In Proceedings of the fifteenth Amsterdam Colloquium, 137–142.
University of Amsterdam. http://www.ventsislavzhechev.eu/Home/
Publications_files/.
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jürgen
Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New ap-
proaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de
Gruyter.
Kratzer, Angelika. 1991. Modality. In Dieter Wunderlich & Arnim von Stechow
(eds.), Semantics: An international handbook of contemporary research,
639–650. Berlin: de Gruyter.
Kratzer, Angelika. 2009. Modals and conditionals again, chapter 3. To be
published by Oxford University Press.
Kroeber, Paul. 1999. The Salish language family: Reconstruct-
ing syntax. Lincoln, NE: The University of Nebraska Press.
doi:10.1017/S0022226702231928.
Littell, Patrick. 2009. Conjectural questions and the wonder effect or: What
could conjectural questions possibly be? Ms, University of British
Columbia.
Littell, Patrick, Lisa Matthewson & Tyler Peterson. 2009. On the semantics of
conjectural questions. Paper presented at the MOSAIC Workshop (Meeting
of Semanticists Active in Canada), Ottawa.
Lunn, Patricia. 1995. The evaluative function of the Spanish subjunctive.
In Joan Bybee & Suzanne Fleischman (eds.), Modality and grammar in
discourse, 419–449. Amsterdam and Philadelphia: Benjamins.
Matthewson, Lisa. 1998. Determiner systems and quantificational strategies:
Evidence from Salish. The Hague: Holland Academic Graphics.
Matthewson, Lisa. 1999. On the interpretation of wide-scope indefinites.
Natural Language Semantics 7(1). 79–134. doi:10.1023/A:1008376601708.
Matthewson, Lisa. 2005. When I was small – i wan kwikws: Grammatical
analysis of St’át’imcets oral narratives. Vancouver, BC: UBC Press.
Matthewson, Lisa. 2006. Presuppositions and cross-linguistic variation. In
Proceedings of NELS 36, Amherst, Mass: GLSA Publications.
Matthewson, Lisa. 2008a. Moods vs. modals in St’át’imcets and beyond. Paper
presented at New York University.
Matthewson, Lisa. 2008b. Pronouns, presuppositions and semantic vari-
ation. In Proceedings of SALT XVIII, 527–550. Cornell University:
CLC Publications. http://www.linguistics.ubc.ca/sites/default/files/
MatthewsonSALTpronouns.pdf.
Matthewson, Lisa. 2010. Evidence about evidentials: Where fieldwork
meets theory. Paper presented at Linguistic Evidence 2010, Uni-
versity of Tübingen. http://www.linguistics.ubc.ca/sites/default/files/
MatthewsonLE2010.pdf.
Matthewson, Lisa. to appear. On apparently non-modal evidentials. To appear
in Proceedings of CSSP 2009 (EISS8).
Matthewson, Lisa, Hotze Rullmann & Henry Davis. 2007. Evidentials as
epistemic modals: Evidence from St’át’imcets. In J.V. Craenenbroeck (ed.),
Linguistic Variation Yearbook, vol. 7, 201–254. John Benjamins Publishing
Company.
Mitchell, Keith. 2003. Had better and might as well: On the margins of
modality? In R. Facchinetti, M. Krug & F. Palmer (eds.), Modality in
contemporary English, 129–149. Berlin: Mouton de Gruyter.
Murray, Sarah. to appear. Evidentiality and questions in Cheyenne. In Suzi
Lima (ed.), Proceedings of SULA 5: Semantics of under-represented lan-
guages in the Americas, Amherst, MA: GLSA Publications.
Palmer, Frank. 2006. Mood and modality. Cambridge: Cambridge University
Press 2nd edn. doi:10.2277/0521804795.
Panzeri, Francesca. 2003. In the (indicative or subjunctive) mood. In Pro-
ceedings of Sinn und Bedeutung 7, http://ling.uni-konstanz.de/pages/
conferences/sub7/proceedings/download/sub7_panzeri.pdf.
Peterson, Tyler. 2009. The ordering source and graded modality in Gitksan
epistemic modals. Ms., University of British Columbia. http://www.
linguistics.ubc.ca/sites/default/files/Peterson(SuB).pdf.
Peterson, Tyler. 2010. Epistemic modality and evidentiality in Gitksan at the
semantics-pragmatics interface: University of British Columbia disserta-
tion. http://hdl.handle.net/2429/23596.
Portner, Paul. 1997. The semantics of mood, complementation and
conversational force. Natural Language Semantics 5(2). 167–212.
doi:10.1023/A:1008280630142.
Portner, Paul. 2003. The semantics of mood. In Lisa Cheng & Rint Sybesma
(eds.), The second Glot international state-of-the-article book, 47–77. Berlin:
Mouton de Gruyter.
Portner, Paul. 2004. The semantics of imperatives within a theory of clause
types. In Proceedings of SALT XIV, Cornell University: CLC Publications.
http://semanticsarchive.net/Archive/mJlZGQ4N/PortnerSALT04.pdf.
Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics
15(4). 351–383. doi:10.1007/s11050-007-9022-y.
Portner, Paul. 2009. Modality (Oxford Surveys in Semantics and Pragmatics).
Oxford: Oxford University Press.
Potts, Christopher. 2005. The logic of conventional implicatures. Oxford:
Oxford University Press.
Quer, Josep. 1998. Mood at the interface. The Hague: Holland Academic
Graphics.
Quer, Josep. 2001. Interpreting mood. Probus 13(1). 81–111.
doi:10.1515/prbs.13.1.81.
Quer, Josep. 2009. Twists of mood: The distribution and interpre-
tation of indicative and subjunctive. Lingua 119(12). 1779–1787.
doi:10.1016/j.lingua.2008.12.003.
Rivero, María. 1975. Referential properties of Spanish noun phrases. Language
51(1). 32–48. doi:10.2307/413149.
Rocci, Andrea. 2007. Epistemic modality and questions in dialogue. the
case of Italian interrogative constructions in the subjunctive mood. In
L. de Saussure, J. Moeschler & G. Puska (eds.), Tense, mood and aspect:
Theoretical and descriptive issues, 129–153. Amsterdam and New York:
Rodopi.
Rullmann, Hotze, Lisa Matthewson & Henry Davis. 2008. Modals as
distributive indefinites. Natural Language Semantics 16(4). 317–357.
doi:10.1007/s11050-008-9036-0.
Schwager, Magdalena. 2005. Interpreting imperatives: University of Frank-
furt/Main dissertation.
Schwager, Magdalena. 2006. Conditionalized imperatives. In Proceedings of
SALT XVI, Cornell University: CLC Publications. http://ecommons.library.
cornell.edu/bitstream/1813/7591/1/salt16_schwager_241_258.pdf.
Schwager, Magdalena. 2008. Optimizing the future - imperatives between
form and function. Course notes, ESSLLI 2008. http://zis.uni-goettingen.
de/mschwager/esslli08/ms_schwager_esslli08.pdf.
Stalnaker, Robert. 1974. Pragmatic presuppositions. In Milton Munitz & Peter
Unger (eds.), Semantics and Philosophy, 197–214. New York University
Press.
Stalnaker, Robert. 1984. Inquiry. Cambridge, MA: MIT Press.
Tenny, Carol. 2006. Evidentiality, experiencers and the syntax of sen-
tience in Japanese. Journal of East Asian Linguistics 15(3). 245–288.
doi:10.1007/s10831-006-0002-x.
Tenny, Carol & Peggy Speas. 2004. The interaction of clausal syntax, discourse
roles and information structure in questions. Paper presented at the Workshop
on Syntax, Semantics and Pragmatics of Questions. ESSLLI, Université
Henri Poincaré, Nancy. http://www.linguist.org/ESSLI-Questions-hd.pdf.
Terrell, Tracy & Joan Hooper. 1974. A semantically based analysis of mood in
Spanish. Hispania 57(3). 484–494. doi:10.2307/339187.
Thoma, Sonja. 2007. The categorical status of independent pronouns in
St’át’imcets. Ms., University of British Columbia.
Villalta, Elisabeth. 2009. Mood and gradability: an investigation of the
subjunctive mood in Spanish. Linguistics and Philosophy 31(4). 467–522.
doi:10.1007/s10988-008-9046-x.
Whitley, Rose (translator), Henry Davis, Lisa Matthewson & Beveley Frank
(editors). no date. Teaching St’át’imcets Through Action. Translation of
Bertha Segal Cook Teaching English Through Action. Upper St’át’imcets
Language, Culture and Education Society.
Lisa Matthewson
UBC Department of Linguistics
Totem Field Studios
2613 West Mall
Vancouver, BC, V6T 1Z4, Canada
lisamatt@interchange.ubc.ca
doi: 10.3765/sp.3.10
Free choice permission as resource-sensitive reasoning∗
Chris Barker
New York University
Received 2009-10-14 / First Decision 2009-11-24 / Revised 2010-07-04 / Accepted
2010-08-14 / Final Version Received 2010-08-31 / Published 2010-09-01
Abstract
Free choice permission is a long-standing puzzle in deontic logic
and in natural language semantics. It involves what appears to be a conjunc-
tive use of or: from You may eat an apple or a pear, we can infer that You
may eat an apple and that You may eat a pear — though not that You may
eat an apple and a pear. Following Lokhorst (1997), I argue that because
permission is a limited resource, a resource-sensitive logic such as Girard’s
Linear Logic is better suited to modeling permission talk than, say, classical
logic. A resource-sensitive approach enables the semantics to track not only
that permission has been granted and what sort of permission it is (i.e.,
permission to eat apples versus permission to eat pears), but also how much
permission has been granted, i.e., whether there is enough permission to
eat two pieces of fruit or only one. The account here is primarily semantic
(as opposed to pragmatic), with no special modes of composition or special
pragmatic rules. The paper includes an introduction to Linear Logic.
Keywords: Free choice, permission, linear logic, deontic, implicature,
resource-sensitive, substructural
∗ Thanks to Simon Charlow, Emmanuel Chemla, Cleo Condoravdi, Judith Degen, Nicholas
Fleisher, Sven Lauer, Koji Mineshima, Paul Portner, Daniel Rothschild, Philippe Schlenker,
Chung-chieh Shan, Seth Yalcin, and my anonymous referees.
©2010 Chris Barker

1 The resource-sensitivity of permission talk
Since Ross 1941, it has been clear that the logic of obligation and permission
behaves dramatically differently than other sorts of ordinary reasoning:
(1) a. You may eat an apple or a pear.
    b. You may eat an apple.
    c. You may eat a pear.
If (1a) is true, then it is certainly true that you may eat an apple. Likewise, it
is equally true that you have it within your power to safely eat a pear. So an
adequate account of the meaning of (1a) must explain how it comes to imply
(1b) and (1c).
This pattern is by no means the usual case. Consider a variation on (1) in
which the permissive modal may is omitted:
(2) a. You ate an apple or a pear.
    b. You ate an apple.
    c. You ate a pear.
In this case, (2a) certainly does not imply either (2b) or (2c). So something
about permission talk correlates with the unusual implications we are con-
cerned with here.
The puzzle posed by the facts in (1) is known as the free choice permission
problem (Kamp (1973) attributes the choice of name to von Wright).
Since (1a) implies both (1b) and (1c), (1b) and (1c) are therefore both equally
true. Thus in many discussions, (1a) is said to imply (3a), since (3a) is merely
the conjunction of (1b) and (1c):
(3) a. You may eat an apple and you may (also) eat a pear.
b. You may eat an apple or you may (*also) eat a pear.
Crucially, however, (3a) has an interpretation on which it furnishes permission
to eat more than one piece of fruit. This interpretation is the one compatible
with adding also in the second conjunct. Now, although (1a) may be consistent
with a situation in which the addressee is allowed to eat more than one piece
of fruit (as we will see below), the truth of (1a) alone is never sufficient to
guarantee that more than one piece of fruit may be eaten. As a result, (3b) is
a better candidate for a paraphrase of (1a): it, too (surprisingly!) implies (1b)
and (1c), but, like (1a), it does not ever justify eating more than one piece of
fruit. This is why also is never appropriate in the second disjunct in (3b) on
the intended reading.
What I am suggesting is that a complete characterization of permission
sentences must not only tell us whether permission exists and what type of
permission it is (i.e., permission to eat an apple versus permission to eat a
pear), it must also characterize how much permission has been granted. Thus
it must predict that (1a) and (3b) guarantee permission only to eat one piece
of fruit, but that (3a) can be used to provide permission to eat two pieces of
fruit.
The key insight that I would like to develop in this paper first appears,
as far as I know, in unpublished work of Lokhorst (1997): that permis-
sion and obligation is a resource-sensitive domain, so that logics based on
(resource-insensitive) classical logic are not appropriate. Lokhorst suggests
using Girard’s (1987) Linear Logic instead, and I will follow the technical
details of his proposal closely. The contribution of this paper will be to
introduce Lokhorst’s work to a linguistic audience, to evaluate it with respect
to competing linguistic analyses, and to investigate the implications of adapt-
ing Lokhorst’s proposal for the theory of natural language semantics and
pragmatics.
Resource-sensitive (‘substructural’) logics are already familiar in linguis-
tics as tools for building syntax/semantics interfaces (e.g., Moortgat 1997
or Dalrymple 2001). As far as I know, however, no one has yet suggested
that natural language connectives such as or or and can have uses in which
they behave semantically like connectives in a substructural logic, as I am
suggesting here.
Kamp (1973, 1978) discusses free choice permission not just as a puzzle
for modeling reasoning about obligation (deontic logic), but as a puzzle
for the composition of natural language expressions. From the point of
view of natural language semantics, the interesting thing about the free
choice permission problem is that it appears to require not only making
assumptions about the meaning of certain uses of modal expressions such as
may, but about the meaning of the corresponding uses of the coordinating
conjunctions and and or. This will be true of the solution I offer below.
Many solutions to the free choice permission problem rely on pragmatic
mechanisms for much of the heavy lifting, including Kamp 1978, Zimmer-
mann 2000, Fox 2007, and others. The arguments that free choice implica-
tions are pragmatic, and more specifically are scalar implicatures, stem from
discussions of indefinites in Kratzer and Shimoyama 2002, as developed by
Alonso-Ovalle (2006) and Fox (2007). The main evidence that free choice
implications may be scalar implicatures turns on the behavior of negated
permission sentences (You may not eat an apple or a pear); I show how the
analysis here can explain the behavior of such sentences in section 5.
In contrast to the pragmatic approaches, I will argue that the main free
choice implications, including especially the implications from (1a) to (1b)
and to (1c), are matters of entailment. To the extent that the analysis here
is viable, it calls into question whether free choice implications are indeed
implicatures. I discuss other entailment approaches (e.g., Aloni 2007) in
section 6.2.
2 Classical logic versus Linear Logic
The account of free choice given below will depend on understanding the
basics of Linear Logic at a fairly deep level. Since Linear Logic is unfamiliar
to most semanticists, this section will present the basics of Linear Logic.
2.1 Classical logic
I will only introduce the elements of classical logic that will be relevant for
comparison with Linear Logic in the discussion below. This will include
conjunction, disjunction, negation, and Weakening, but not, for example,
quantification.
Formulas. There is a set of atomic formulas a, b, c, . . . , and a set of
variables over formulas A, B, C, . . . . Assume A and B are formulas. Then the
classical negation of A, written ¬A, is a formula; the classical conjunction of
A and B, written A ∧ B, is a formula; and the classical disjunction of A and B,
written A ∨ B, is a formula. In addition, the classical implication of A and B,
written A → B, is defined as an abbreviation of (¬A) ∨ B.
Sequents. A sequent A, B, . . . , M ⊢ N, O, . . . , Z consists of two multisets
of formulas joined by a turnstile (‘⊢’). Classical sequents are interpreted as
asserting that whenever all of the formulas in the leftmost multiset hold,
then at least one of the formulas in the rightmost multiset must also hold.
Saying that a sequent contains multisets rather than lists of formulas means
that the order in which formulas are written is immaterial. Thus A, B and
B, A represent the same multiset, but A, B is a different multiset than A, A, B,
since the second multiset contains two instances of the formula A.
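The contrast between multisets and lists can be made concrete with a small sketch. The fragment below is only an illustration (the helper name `multiset` is an assumption of the sketch, not notation from the text); it relies on Python's standard `Counter`, which ignores order but keeps track of how many times each formula occurs.

```python
from collections import Counter

def multiset(*formulas):
    """Build a multiset of formulas: order is ignored, multiplicity is kept."""
    return Counter(formulas)

# A, B and B, A are the same multiset.
same_despite_order = multiset("A", "B") == multiset("B", "A")

# A, B and A, A, B differ, because A occurs twice in the second one.
same_despite_count = multiset("A", "B") == multiset("A", "A", "B")

print(same_despite_order, same_despite_count)
```

A plain set would conflate A, B with A, A, B, which is exactly the distinction that matters once assumptions are treated as resources.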
Capital Greek letters (∆, Γ, . . . ) schematize over (possibly empty) multisets
of formulas. The turnstile can occur in any position, and there can be more
than one formula on the right hand side, so that the expression ‘∆ ⊢ A, B’,
the expression ‘∆ ⊢’, and the expression ‘⊢ ∆’ are all legitimate sequents.
Negation. The following pair of inference rules characterize classical
negation:
$$
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash \neg A, \Gamma}\;\neg_1
\qquad\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, \neg A \vdash \Gamma}\;\neg_2
$$
Beginning with ¬₁, the inference rule on the left: if Γ follows from the
formulas in ∆ along with A (this is what the sequent above the horizontal
line expresses), then from ∆ alone we can conclude that either some member
of Γ is still true, or else A must be false (the sequent below the horizontal
line). Similar reasoning applies for the inference rule on the right, ¬₂.
Proofs. A proof that a sequent is valid begins with trivial tautologies,
here, that A ⊢ A:
$$
\dfrac{\dfrac{A \vdash A}{\vdash \neg A, A}\;\neg_1}{\neg\neg A \vdash A}\;\neg_2
$$
As long as each subsequent inference step instantiates a valid inference rule,
the proof guarantees that the final sequent will also be valid. A sequent at
the bottom of such a proof is called a theorem of the logic.
Reading from top to bottom, the first step of the proof here is an
instantiation of the inference rule ¬₁. This step concludes that either A or its
negation must be true (a version of the law of excluded middle); the second
step (labeled ¬₂) proves that two adjacent negations cancel out (the law of
double negation). Proving that A ⊢ ¬¬A is equally easy.
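For concreteness, the converse direction can be spelled out with the same two negation rules applied in the opposite order. This is a worked sketch, not a derivation taken from the text: ¬₂ first moves A to the left of the turnstile as ¬A, and ¬₁ then moves ¬A back to the right as ¬¬A.

$$
\dfrac{\dfrac{A \vdash A}{A, \neg A \vdash}\;\neg_2}{A \vdash \neg\neg A}\;\neg_1
$$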
Conjunction. The inference characterizing classical conjunction has two
premises:
$$
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \wedge B}\;\wedge
$$
If the assumptions in ∆ allow you to prove that A is true (i.e., if ∆ ⊢ A), and
the very same set of assumptions also allows you to prove that B is true, then
you are certainly in a position to assert that the classical conjunction of A
and B must be true.
Disjunction. For disjunction, we have a matched pair of inferences:
$$
\frac{\Delta \vdash A}{\Delta \vdash A \vee B}\;\vee_1
\qquad\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \vee B}\;\vee_2
$$
If the assumptions in ∆ allow you to prove that some proposition A is true,
you can conclude that the classical disjunction of A and B is true. After all,
if you know that Ann arrived, then you know that either Ann arrived or Bill
arrived. The reason we need a pair of rules is that disjunction is symmetric,
i.e., we are free to add the new disjunct either on the left or on the right.
The classical duality of conjunction and disjunction. The following
equivalences hold:
(4) a. ¬¬A ≡ A
b. ¬(A ∧ B) ≡ ¬A ∨ ¬B
c. ¬(A ∨ B) ≡ ¬A ∧ ¬B
The last two (DeMorgan’s laws) express the logical interrelationship between
disjunction and conjunction. These equivalences can be thought of as bi-
directional inference rules. In any case, I will freely replace formulas with
forms deemed equivalent by (4).
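The equivalences in (4) can be checked mechanically by running through every truth-value assignment. The sketch below is a plain truth-table check (the function name `equivalent` is a hypothetical helper, not part of the formal system described here), so it verifies the classical semantics of the equivalences rather than deriving them with the sequent rules.

```python
from itertools import product

def equivalent(f, g, arity):
    """True if f and g agree on every assignment of truth values."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=arity)
    )

# (4a) double negation
double_negation = equivalent(lambda a: not (not a), lambda a: a, 1)

# (4b) and (4c) De Morgan's laws
de_morgan_and = equivalent(
    lambda a, b: not (a and b), lambda a, b: (not a) or (not b), 2
)
de_morgan_or = equivalent(
    lambda a, b: not (a or b), lambda a, b: (not a) and (not b), 2
)

print(double_negation, de_morgan_and, de_morgan_or)
```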
Weakening. Weakening allows assumptions to be discarded.
$$
\frac{\Delta \vdash \Gamma}{\Delta, A \vdash \Gamma}\;\text{Weak}
$$
If Γ follows from ∆, then Γ certainly still follows if A also happens to be true,
no matter what A happens to express. The assumption A is gratuitous, but
harmless. Weakening allows us to pick and choose among evidence as we
focus on different parts of an argument.
Implication as a form of disjunction. Recall that in the definitions of
well-formed formulas, we defined classical implication A → B as an abbrevia-
tion of ¬A ∨ B. The inference rule that characterizes implication is Modus
Ponens, which says that A, A → B ⊢ B is valid. We can prove Modus Ponens
as follows. The main aspect of the proof that is relevant for comparison with
Linear Logic is the role of Weakening.
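$$
\dfrac{\dfrac{\dfrac{\dfrac{A \vdash A}{A, \neg B \vdash A}\;\text{Weak}
\qquad
\dfrac{\neg B \vdash \neg B}{\neg B, A \vdash \neg B}\;\text{Weak}}
{\neg B, A \vdash A \wedge \neg B}\;\wedge}
{A, \neg(A \wedge \neg B) \vdash \neg\neg B}\;\neg_1, \neg_2}
{A, A \to B \vdash B}\;\equiv
$$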
Wadler (1993) uses classical modus ponens in the following proof to
emphasize the differences between classical logic and Linear Logic:
$$
\dfrac{\dfrac{A \vdash A}{A, A \to B \vdash A}\;\text{Weak}
\qquad
\begin{array}{c}\text{[see previous proof]}\\ A, A \to B \vdash B\end{array}}
{A, A \to B \vdash A \wedge B}\;\wedge
$$
Weakening allows us to make use of assumption A twice: once to justify the
left conjunct of the conclusion, and once to support modus ponens in order
to derive the right conjunct of the conclusion. We will see that Linear Logic
requires careful accounting: each assumption can be used exactly once, so
this proof will not go through.
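On the classical side, the sequents discussed so far can be confirmed with a brute-force semantic check. The sketch below follows the interpretation of classical sequents given earlier (whenever all left-hand formulas hold, at least one right-hand formula holds); the encoding of formulas as Python functions over an assignment and the name `valid` are assumptions made for illustration only.

```python
from itertools import product

def valid(left, right, atoms):
    """Classical validity: no assignment makes all of `left` true and all of `right` false."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(f(env) for f in left) and not any(g(env) for g in right):
            return False
    return True

# Formulas as functions from an assignment to a truth value.
A = lambda env: env["A"]
B = lambda env: env["B"]
A_implies_B = lambda env: (not env["A"]) or env["B"]   # A → B as ¬A ∨ B
A_and_B = lambda env: env["A"] and env["B"]

atoms = ["A", "B"]
modus_ponens = valid([A, A_implies_B], [B], atoms)        # A, A → B ⊢ B
wadler = valid([A, A_implies_B], [A_and_B], atoms)        # A, A → B ⊢ A ∧ B
weakened = valid([A, B], [A], atoms)                      # A, B ⊢ A
missing_premise = valid([A_implies_B], [B], atoms)        # A → B ⊢ B

print(modus_ponens, wadler, weakened, missing_premise)
```

The check treats the left-hand side as a set of truths, so using A once or twice makes no difference to it; that insensitivity to how often an assumption is used is what the linear system gives up.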
Finally, completing the ¬, ∧, ∨ fragment of classical logic requires
Contraction: from ∆ ⊢ A, A, infer ∆ ⊢ A. In Linear Logic, Contraction is also
rejected, but Contraction does not play a role in the exposition here.
2.2 Linear Logic
Formulas. Once again there is a set of atomic formulas a, b, c, . . . , and a
set of variables over formulas A, B, C, . . . . However, since none of the Linear
Logic connectives mean what their classical counterparts mean, Linear Logic
uses a completely distinct set of connective symbols. Assume A and B are
formulas. Then the linear negation of A, written A⊥, is a formula; the additive
conjunction of A and B, written A & B (pronounced “A with B”) is a formula;
the multiplicative conjunction of A and B, written A ⊗ B (pronounced “A
times B”) is a formula; the additive disjunction of A and B, written A ⊕ B
(pronounced “A plus B”) is a formula; and the multiplicative disjunction of
A and B, written A ⅋ B (pronounced “A par B”) is a formula. (Many things in
natural language semantics are called ‘additive’. The Linear Logic notions of
‘additive’ and ‘multiplicative’ do not line up with any of them.) In parallel
with the definition of classical implication above, linear implication, written
A ⊸ B (pronounced “A lollipop B”), is defined as an abbreviation for A⊥ ⅋ B.
Sequents. A sequent ∆ ⊢ Γ says that whenever the multiplicative conjunction
of ∆ holds, then the multiplicative disjunction of Γ must hold.
Fragment of Linear Logic for the free choice permission problem. Fig-
ure 1 displays the complete set of rules of Linear Logic that we will use in the
discussion of the free choice permission problem.
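$$
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash A^{\perp}, \Gamma}\;\perp_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, A^{\perp} \vdash \Gamma}\;\perp_2
\qquad
\frac{}{A \vdash A}\;\text{Axiom}
$$
$$
A \multimap B \equiv A^{\perp} \parr B
\qquad
A^{\perp\perp} \equiv A
$$
$$
\begin{aligned}
(A \mathbin{\&} B)^{\perp} &\equiv A^{\perp} \oplus B^{\perp} &
(A \otimes B)^{\perp} &\equiv A^{\perp} \parr B^{\perp} \\
(A \oplus B)^{\perp} &\equiv A^{\perp} \mathbin{\&} B^{\perp} &
(A \parr B)^{\perp} &\equiv A^{\perp} \otimes B^{\perp}
\end{aligned}
$$
$$
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A \qquad \Gamma \vdash B}{\Delta, \Gamma \vdash A \otimes B}\;\otimes
$$
$$
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
\qquad
\frac{\Delta \vdash A, B}{\Delta \vdash A \parr B}\;\parr
$$
Figure 1 Fragment of Linear Logic for FCP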
Linear conjunction and disjunction. The rules for & and ⊕ (the ‘additive’
connectives) look exactly like the classical rules for ∧ and ∨, except for the
substitution of & for ∧ and of ⊕ for ∨. However, as a result of how they
interact with the rest of the logic, the linear logic additives behave differently
from their classical counterparts. For instance, the law of the excluded
middle is valid for classical disjunction: ⊢ (¬A) ∨ A. In Linear Logic, the law
of excluded middle is not valid for additive disjunction, despite the fact that
the inference rule for additive disjunction has the same form as the inference
rule for classical disjunction: ⊬ A⊥ ⊕ A. However, the excluded middle is
valid for multiplicative disjunction (⊢ A⊥ ⅋ A).
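The positive half of this contrast can be checked directly against the rules in Figure 1. The following worked derivation is a sketch added for illustration: starting from the axiom, ⊥₁ moves A across the turnstile, and the ⅋ rule then combines the two right-hand formulas into one.

$$
\dfrac{\dfrac{A \vdash A}{\vdash A^{\perp}, A}\;\perp_1}{\vdash A^{\perp} \parr A}\;\parr
$$

By contrast, ⊕₁ and ⊕₂ would each require a proof of one of the disjuncts on its own, and neither A⊥ nor A is provable from no assumptions.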
Linear negation. We have direct analogs to the classical rules for pushing
a formula across the turnstile, namely, ⊥₁ and ⊥₂. Since we now have two
kinds of conjunctions and two kinds of disjunctions, there are more duality
equivalences; however, each conjunction is still dual to a disjunction, and
vice-versa.
Linear implication. Once again, we have defined implication in terms of
disjunction. Now, interestingly, we can prove the linear version of Modus
Ponens without using Weakening (which is a good thing, since Weakening is
not allowed in Linear Logic):
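$$
\dfrac{\dfrac{\dfrac{A \vdash A \qquad B^{\perp} \vdash B^{\perp}}
{A, B^{\perp} \vdash A \otimes B^{\perp}}\;\otimes}
{A, (A \otimes B^{\perp})^{\perp} \vdash B^{\perp\perp}}\;\perp_1, \perp_2}
{A, A \multimap B \vdash B}\;\equiv
$$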
Because the inference rule for ⊗ splits up the resources (that is, the formulas)
into those used to prove A and those used to prove B, there is no need to
ignore gratuitous assumptions via Weakening.
If we try to reproduce Wadler’s classical proof from the previous section,
we’re out of luck:
$$
\frac{??\; \vdash A \qquad ??\; \vdash B}{A, A \multimap B \vdash A \otimes B}\;\otimes
$$
We could take some of the resources to the left of the turnstile to prove A,
and we could take some (actually, we would need all) of the resources to
prove B, but no matter how we divide up the left-hand formulas, we’ll fall
short of proving one or the other of the conjuncts. Linear Logic requires
strict accounting of assumptions, and we can’t make use of A twice, the way
we could in the classical proof.
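The strict accounting can be explored with a small backward proof search. The sketch below is an illustration under stated assumptions, not the calculus of Figure 1: it uses a single-conclusion (intuitionistic) formulation with left and right rules for ⊗, &, ⊕ and ⊸ only, with no negation and no ⅋, and every name in it (`tensor`, `with_`, `plus`, `lolli`, `provable`) is hypothetical. Atoms are strings and compound formulas are tagged tuples.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical encoding: atoms are strings, compound formulas are tagged tuples.
def tensor(a, b): return ("tensor", a, b)   # A ⊗ B
def with_(a, b): return ("with", a, b)      # A & B
def plus(a, b): return ("plus", a, b)       # A ⊕ B
def lolli(a, b): return ("lolli", a, b)     # A ⊸ B

def _ctx(formulas):
    """Canonical, hashable form of a multiset of formulas."""
    return tuple(sorted(formulas, key=repr))

def _splits(ctx):
    """Yield every division of ctx into two disjoint sub-multisets."""
    n = len(ctx)
    for size in range(n + 1):
        for chosen in combinations(range(n), size):
            left = tuple(ctx[i] for i in chosen)
            right = tuple(ctx[i] for i in range(n) if i not in chosen)
            yield left, right

@lru_cache(maxsize=None)
def _prove(ctx, goal):
    # Axiom: exactly one resource, identical to the goal.
    if len(ctx) == 1 and ctx[0] == goal:
        return True
    # Right rules: decompose the goal.
    if isinstance(goal, tuple):
        op, a, b = goal
        if op == "tensor" and any(
            _prove(_ctx(l), a) and _prove(_ctx(r), b) for l, r in _splits(ctx)
        ):
            return True
        if op == "with" and _prove(ctx, a) and _prove(ctx, b):
            return True
        if op == "plus" and (_prove(ctx, a) or _prove(ctx, b)):
            return True
        if op == "lolli" and _prove(_ctx(ctx + (a,)), b):
            return True
    # Left rules: decompose one assumption.
    for i, formula in enumerate(ctx):
        if not isinstance(formula, tuple):
            continue
        op, a, b = formula
        rest = ctx[:i] + ctx[i + 1:]
        if op == "tensor" and _prove(_ctx(rest + (a, b)), goal):
            return True
        if op == "with" and (
            _prove(_ctx(rest + (a,)), goal) or _prove(_ctx(rest + (b,)), goal)
        ):
            return True
        if op == "plus" and (
            _prove(_ctx(rest + (a,)), goal) and _prove(_ctx(rest + (b,)), goal)
        ):
            return True
        if op == "lolli" and any(
            _prove(_ctx(l), a) and _prove(_ctx(r + (b,)), goal)
            for l, r in _splits(rest)
        ):
            return True
    return False

def provable(assumptions, goal):
    """True if the search finds a proof that uses every assumption exactly once."""
    return _prove(_ctx(assumptions), goal)

modus_ponens = provable(["A", lolli("A", "B")], "B")                    # A, A ⊸ B ⊢ B
wadler_once = provable(["A", lolli("A", "B")], tensor("A", "B"))        # one copy of A
wadler_twice = provable(["A", "A", lolli("A", "B")], tensor("A", "B"))  # two copies of A
weakening = provable(["A", "B"], "A")                                   # B left unused
print(modus_ponens, wadler_once, wadler_twice, weakening)
```

In this sketch the search succeeds for linear Modus Ponens, fails for the linear counterpart of Wadler's sequent, and succeeds again once a second copy of A is supplied. The sequent with an unused assumption B also fails, which reflects the absence of Weakening.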
2.3 Choice
Since free choice permission is about making choices, what does Linear Logic
have to say about choice?
The critical connectives will be the additive conjunction ‘&’ and its (also
additive) disjunctive dual, ‘⊕’. The relevant inference rules are repeated here:
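$$
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
$$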
Imagine yourself in the role of the prover. Then the assumptions on the left
of the turnstile are what your environment gives you to work with, and the
conclusion on the right of the turnstile is what you return as the result of
your labors (perhaps to be used as an assumption in a larger proof).
So here is what the & inference says: if the resources in ∆ allow you to
provide A, and if the same resources allow you to provide B, then you can
certainly offer to provide either A or B. Furthermore, since you are prepared
to provide either alternative, you can leave the choice up to whoever might
be interested in making use of the conclusion. Thus & conjoins two equally
viable alternatives.
Though both alternatives are equally viable, the consumer is forced to
choose between them. For instance, imagine that ∆ contains a certain amount
of sugar and a certain number of eggs. Using the resources provided, you
can construct either a meringue or else an angel food cake, but you don’t
have enough ingredients to cook both. Being as flexible and gracious as
possible, you offer “meringue & cake” for dessert, and you let your guest
choose. Tellingly, “meringue & cake” is pronounced “meringue or cake” in
idiomatic English (this is a point that we will return to in section 7.3).
In the context of granting permission, the consumer is the entity to which
permission has been granted: we shall see that (unembedded) & corresponds
to free choice on the part of the entity given permission.
Continuing with our investigation of choice in Linear Logic, turning to
the ⊕₁ inference rule, if the resources in ∆ allow you to provide A, then you
can certainly offer to provide A ⊕ B, as long as you remain in control
of which of the alternatives is chosen. You may only know how to make
one dessert, perhaps. You can truthfully promise that dessert will either be
meringue or else Baked Alaska, although you know in advance that it will
have to be meringue. (Analogously with the roles reversed for ⊕₂.)
In the context of granting permission, offering A ⊕ B does not give the
grantee free choice.
In order to complete the picture of the dualities of & and ⊕, we must
consider what happens on the other side of the turnstile. Hopping across the
turnstile involves negation, which exchanges & for ⊕ (and vice versa).
$$
\frac{A \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\frac{B \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\frac{A \vdash \Delta \qquad B \vdash \Delta}{A \oplus B \vdash \Delta}
$$
These rules follow from the official inference rules by applications of ⊥₁ and
⊥₂.
If A alone is enough to enable you to provide ∆, then if someone promises
you A & B, you can certainly commit to providing ∆: just select A when they
give you your choice. (Similarly for the other rule introducing & on the left of
the turnstile.)
Finally, if having A is enough for you to be able to offer ∆, and if having
B is likewise enough for you to be able to offer ∆, then you’re in a position to
promise ∆ even if all you can count on is A ⊕ B. All you know is that you’ll
get either an A or a B, and that which one you get will be someone else’s
choice. However, since you are prepared to cope with either possibility, you
can commit to providing ∆.
The bottom line is that & and ⊕ are two perspectives on a single choice,
differing only in who has the power to make the selection: & provides two
equally legitimate alternatives, but forces an unconstrained (free) choice
between them; ⊕ also provides two alternatives, but reserves the choice for
whoever is providing the resource.
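The dessert scenario can be turned into a toy resource model. The sketch below is an assumption-laden illustration: the ingredient quantities are invented, each recipe is taken to use up the whole pantry (so that "∆ provides A" means exact consumption), and the function names are hypothetical. It contrasts offering A & B, offering A ⊕ B, and the multiplicative A ⊗ B from section 2.2.

```python
from collections import Counter

# Hypothetical quantities: each dessert consumes exactly these ingredients.
MERINGUE = Counter(sugar=1, eggs=4)
CAKE = Counter(sugar=1, eggs=4)
BAKED_ALASKA = Counter(sugar=1, eggs=4, ice_cream=1)

def can_offer_with(pantry, a, b):
    """& rule: the same resources suffice for a, and they also suffice for b."""
    return pantry == a and pantry == b

def can_offer_plus(pantry, a, b):
    """⊕1 / ⊕2 rules: the resources suffice for at least one of a and b."""
    return pantry == a or pantry == b

def can_offer_tensor(pantry, a, b):
    """⊗ rule: the resources split into one share for a and one share for b."""
    return pantry == a + b

pantry = Counter(sugar=1, eggs=4)

guest_chooses = can_offer_with(pantry, MERINGUE, CAKE)             # meringue & cake
both_desserts = can_offer_tensor(pantry, MERINGUE, CAKE)           # meringue ⊗ cake
host_chooses = can_offer_plus(pantry, MERINGUE, BAKED_ALASKA)      # meringue ⊕ Baked Alaska
guest_picks_alaska = can_offer_with(pantry, MERINGUE, BAKED_ALASKA)
doubled = can_offer_tensor(pantry + pantry, MERINGUE, CAKE)

print(guest_chooses, both_desserts, host_chooses, guest_picks_alaska, doubled)
```

With one pantry the host can let the guest choose between meringue and cake but cannot serve both. The ⊕ offer involving Baked Alaska is truthful even though only meringue can actually be made, and serving both desserts becomes possible only when the pantry is doubled.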
3 Strong permission versus weak permission
Standard deontic logics introduce unary modalities representing obligation
(□) and permission (♦), and add axioms that characterize an appropriate set of
entailments, usually including at least K and D, though there is considerable
variation; see McNamara 2006 or Portner 2009a for an introduction to deontic
logic. Lokhorst (1997) chooses instead a strategy attributed independently
to Anderson and to Kanger called deontic reduction. Deontic reduction
depends on a special proposition δ (pronounced “yay”), glossed as ‘the good
thing’, or ‘all things are as required’. Thus δ is roughly analogous to Kratzer’s
(e.g., 1991) notion of an ordering source, that is, the set of propositions that
characterize how things ought to be.
Then A is obligatory iff δ ⊸ A: if A follows from the state where all things
are as required, then A is required. Dually, a weak version of permission
is often defined as (δ ⊸ A⊥)⊥: if the negation of A is not obligatory, then
A is at least not forbidden. However, there is a difference between weak
permission, which is the absence of prohibition, and strong permission, i.e.,
a permissive norm (as discussed in, e.g., Hansen et al. 2007), which is the
assertion that some action is explicitly ok.
Lokhorst (1997) renders strong permission as A ⊸ δ. Viewed from the
linguistics tradition, it is not so easy to make sense out of this as a statement
of permission (as discussed in Portner 2009a:60). It is important to bear in
mind that the ‘strong’ part of ‘strong permission’ does not mean that merely
eating an apple will guarantee that everything is ok, no matter what else
happens. If only permission could be that strong! Rather, the difference
between ‘weak’ and ‘strong’ here is the difference between a system in which
we have only obligation and its negation (in which everything that is not
forbidden is permitted), and a more articulated system in which some things
are permitted (A ⊸ δ), some things are forbidden ((A ⊸ δ)⊥), and some
things are neither permitted nor forbidden. If I explicitly give you permission
to eat an apple, and I explicitly forbid you to eat a pear, what about eating a
banana? Is it permitted or forbidden? Maybe yes, maybe no.
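For reference, the deontic notions introduced so far can be collected in one place. This is only a summary sketch of the definitions given in the text; the labels O, P_weak, P_strong and F are shorthand assumed here for illustration and are not notation from Lokhorst or from this paper.

```latex
% Summary of the reduction-style definitions discussed above.
% O, P_weak, P_strong, F are illustrative labels only.
\begin{align*}
O(A)                 &:= \delta \multimap A                       && \text{obligation} \\
P_{\mathrm{weak}}(A)   &:= (\delta \multimap A^{\perp})^{\perp}     && \text{weak permission: } A^{\perp} \text{ is not obligatory} \\
P_{\mathrm{strong}}(A) &:= A \multimap \delta                       && \text{strong permission} \\
F(A)                 &:= (A \multimap \delta)^{\perp}             && \text{prohibition: negated strong permission}
\end{align*}
```

On this layout the banana case corresponds to a formula B for which neither B ⊸ δ nor (B ⊸ δ)⊥ has been asserted.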
There is not much discussion of weak permission versus strong per-
mission in the linguistics literature, but at least Asher and Bonevac (2005)
conclude that free choice permission involves strong permission. Certainly
if we want to distinguish between explicit permission and the absence of
prohibition, then we need a logic that can express strong permission. Since I
have claimed that You may eat an apple or a pear crucially neither permits
nor forbids eating both an apple and a pear, we must use strong permission
here.
But what exactly does A ⊸ δ assert, if not that eating an apple will
guarantee the good thing? The key is to consider when A ⊸ δ will be true.
We will be in a situation in which A ⊸ δ just in case eating an apple in
that situation is compatible (‘cotenable’ in the terminology of Relevant Logic)
with all obligations being fulfilled. There are two kinds of such situations:
situations in which eating an apple happens to be obligatory, in which case
we can only conform to obligations by eating the apple (after all, everything
that is obligatory is at least permitted); and situations in which we’re already
in compliance, but eating an apple is optional and does not disturb our happy
state. But if we are otherwise in compliance, and we decide to eat an apple
(A), and we decide to simultaneously kill the postman (K), the fact that apple
eating is permitted will not save us: because of the resource-sensitivity of
linear logic, in particular, the absence of Weakening, we can’t ignore the dead
postman. As a result, the combination of eating an apple and killing the
postman will land us in a situation that is far from ok: A, K, A ⊸ δ ⊬ δ.
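The contrast can be made concrete with a small derivation sketch. It assumes a two-sided left rule for ⊸ in a common textbook formulation, which may differ in presentation from the official rules of section 2.2; the rule labels are likewise illustrative.

```latex
% Apple-eating alone, together with its permission, yields the good thing:
\[
\frac{A \vdash A \qquad \delta \vdash \delta}
     {A,\; A \multimap \delta \vdash \delta}\;\multimap_L
\]
% To absorb the extra resource K we would need Weakening,
% which linear logic does not provide:
\[
\frac{\Gamma \vdash \Delta}{\Gamma,\; K \vdash \Delta}\;\text{(Weakening: not a rule of linear logic)}
\]
% Hence, for atomic A, K and \delta:
\[
A,\; K,\; A \multimap \delta \;\nvdash\; \delta
\]
```

In the failed case every way of applying the left rule leaves K stranded in one of the premises, and no axiom can close a sequent with an unused atom.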
A fuller understanding of linear implication, and therefore of strong
permission, will emerge from the model theory developed in section 8.
One major expository advantage of the reduction strategy is that it enables
us to talk about permission without complicating the logic with inference
rules for □ and ♦. Note that we do not necessarily give up anything by omitting
the unary connectives: McNamara (2006) and Lokhorst (2006) show that
under appropriate additional assumptions, deontic reduction characterizes
all the theorems of standard deontic modal logics.
Not that replicating standard deontic logic should be our goal; after all,
standard deontic logic has □A → ¬□¬A as a tautology, which imposes a kind
of consistency on the set of deontic obligations. In the linguistics tradition,
a number of people (notably Kratzer (1991)) have argued that this is not
appropriate for describing natural language modality, and that we should
instead allow for inconsistent laws. However, I’m not aware of any reason
why deontic reduction is incompatible with Kratzer’s characterization of
deontic modality.
I should note that deontic reduction is not an innocent choice for the
empirical phenomena under consideration here. As I will explain shortly,
because linear implication is defined as A ⊸ B ≡ A⊥ ⅋ B, the formula for
which permission is granted (i.e., A) occurs in a downward-entailing position.
This will be crucial in deriving the desired entailments. For all I know,
however, it is possible that if a suitable notion of strong permission were
defined in a standard deontic framework (i.e., one based on unary operators
like □), similar entailments would go through.
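To see why the permitted formula sits in a downward-entailing position, it may help to unfold the definition. The following is a sketch using standard linear-logic equivalences (De Morgan duality and contravariance of ⊸ in its antecedent); the \parr macro is assumed to come from a package such as cmll.

```latex
% \parr: multiplicative disjunction (assumed macro, e.g. from the cmll package)
\begin{align*}
A \multimap \delta           &\equiv A^{\perp} \parr \delta \\
(A \multimap \delta)^{\perp} &\equiv A \otimes \delta^{\perp} \\[4pt]
\text{if } A \vdash B \text{, then}\quad B \multimap \delta &\vdash A \multimap \delta \\
\text{e.g. } A \vdash A \oplus B \text{, so}\quad (A \oplus B) \multimap \delta &\vdash A \multimap \delta
\end{align*}
```

Because A occurs negated in A⊥ ⅋ δ, strengthening A weakens the obligation on the antecedent side, which is the pattern the free choice derivation exploits.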
I intend for deontic reduction to be a convenient expository choice, and
not an essential feature of a resource-sensitive approach to free choice
permission. Nevertheless, there may be some empirical support for the
naturalness of deontic reduction. After all, in addition to being able to use
a modal verb to express permission and obligation, English can also deploy
a conditional: It’s ok if you eat, i.e., ‘You may eat’. In fact, in Japanese there is
no modal verb that expresses permission, and permission normally can only
be conveyed by means of a conditional construction (Clancy 1985, Akatsuka
1992): tabe-temo ii ‘eat-even.if good’, ‘It’s ok if you eat’.
4 Free choice permission
We can now suppose that or has among its meanings ⊕, so that You may
eat an apple or⊕ a pear translates as (a ⊕ p) ⊸ δ: the additive disjunction
of a and p is explicitly permitted. Then the desired free-choice implication
follows directly from simple linear reasoning. Generalizing slightly by using
variables over formulas (A, B) instead of atomic formulas (a, p), we have:
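\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{\vdash A, A^{\perp}}{\vdash A \oplus B, A^{\perp}}\;\oplus_1
        \qquad
        \vdash \delta^{\perp}, \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp}, A^{\perp}, \delta}\;\otimes
    }{\vdash (A \oplus B) \otimes \delta^{\perp}, A^{\perp} \parr \delta}\;\parr
    \qquad
    \dfrac{
      \dfrac{
        \dfrac{\vdash B, B^{\perp}}{\vdash A \oplus B, B^{\perp}}\;\oplus_2
        \qquad
        \vdash \delta^{\perp}, \delta
      }{\vdash (A \oplus B) \otimes \delta^{\perp}, B^{\perp}, \delta}\;\otimes
    }{\vdash (A \oplus B) \otimes \delta^{\perp}, B^{\perp} \parr \delta}\;\parr
  }{\vdash (A \oplus B) \otimes \delta^{\perp}, (A^{\perp} \parr \delta) \mathbin{\&} (B^{\perp} \parr \delta)}\;\&
}{(A \oplus B) \multimap \delta \vdash (A \multimap \delta) \mathbin{\&} (B \multimap \delta)}\;\perp_2, \equiv
\]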
This theorem is noted in Lokhorst 1997:6.¹

What the speaker provides when she utters You may eat an apple or⊕ a
pear is justification for assuming either that eating an apple is permitted,
or that eating a pear is permitted. She is not providing enough resources
to prove both, so if her utterance is to provide the justification for action,
a choice must be made. However, since the resources allow proof of either
alternative, the consumer is free to choose whichever of the alternatives he
prefers. That is how the addressee has permission to eat an apple, or else
permission to eat a pear, but normally (and certainly not by virtue of the
utterance of (1a)) does not have permission to eat two pieces of fruit.
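The "one or the other, but not both at once" character of the result can be stated as three sequents. This is a sketch for atomic A, B and δ, relying only on standard properties of & and ⊗ in linear logic.

```latex
% What a single use of the permission (A \oplus B) \multimap \delta supports:
\begin{align*}
(A \oplus B) \multimap \delta &\vdash A \multimap \delta \\
(A \oplus B) \multimap \delta &\vdash B \multimap \delta \\
(A \oplus B) \multimap \delta &\nvdash (A \multimap \delta) \otimes (B \multimap \delta)
\end{align*}
```

The first two lines follow by projecting from the & in the theorem above. The third fails because proving a ⊗ on the right requires splitting the available resources between the two conjuncts, and there is only one permission to spend.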
This result depends on only two assumptions: that or can express ad-
ditive disjunction, and that it is reasonable to represent strong permission
using the deontic reduction strategy. The assumption that or can express
additive disjunction is essential, and is the heart of the explanation offered
here. Deontic reduction is a well-established approach to deontic logic moti-
vated entirely independently of any concern with the free choice permission
problem. Whether it can be replaced with a modal system more familiar to
linguists (if desired) remains for future work.
1 Strictly speaking, since the inference rules given above in section 2.2 are written with a single
formula on the right-hand side, many of the steps given in this proof (for example, the ⊕1
inference) require shuffling extra formulas across the turnstile, applying the inference rule
of interest, then shuffling them all back.
It is worth emphasizing that the basic free choice meaning is purely
semantic, without requiring any silent pragmatically-triggered type shifting
operators (as in, e.g., Fox 2007), or other pragmatic enrichment.
5 Prohibition
The behavior of permission under negation plays an important role in recent
discussions. As mentioned above, Alonso-Ovalle (2006) and Fox (2007) argue
that the fact that free-choice implications seem to disappear under negation
shows that free choice implications are likely to be implicatures. Since I
am claiming that the relevant free choice implications are entailments, it is
important to carefully examine negated cases.
Whatever is not permitted is forbidden: just as in English, Lokhorst
renders (strong) prohibition as negated (strong) permission. Thus if
(A ⊸ δ)⊥, then A is prohibited. (It is a well-known property of English that may
not is always construed with negation taking scope over may.)
(5) a. You may not eat this apple or this pear.
b. You may not eat this apple.
c. You may not eat this pear.
The main fact to be explained is that (5a) implies (perhaps entails) (5b) and
(5c). Unlike positive free choice implications, we can usually infer that (5b)
and (5c) hold simultaneously. That is, you cannot comply with (5a) by merely
refraining from eating apples. Apparently, permission is a scarce resource,
but prohibition is all too abundant. I will call this construal of (5a) the double-
prohibition reading, and I will suggest that it arises as a standard Gricean
implicature.
As with most stories about scalar implicatures, we will be concerned with
the epistemic state of the discourse participants.
(6) a. You may not eat this apple or this pear.
b. You may not eat this apple or you may not eat this pear.
c. ((A ⊕ B) ⊸ δ)⊥ ⊢ (A ⊸ δ)⊥ ⊕ (B ⊸ δ)⊥
The translation of (6a) entails the translation of (6b) (that is, (6c) is a theorem),
so we predict that (6a) ought to have an interpretation on which it guarantees
that (6b) is true. Such an interpretation is widely attested in the literature,
and usually is described as favoring the continuation . . . but I don’t know
which. I’ll call this the ignorance reading.
Note, by the way, if a forgetful babysitter utters (6) to the child she is
babysitting, if the child behaves rationally, he will not eat either piece of fruit,
since he can’t be sure which action is safe — exactly the same behavior as if
both actions had been explicitly forbidden.
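One way to see that (6c) is a theorem is by contraposition. The sketch below starts from the sequent that section 7.3 states is easy to prove, and then applies two standard facts of linear logic: X ⊢ Y holds just in case Y⊥ ⊢ X⊥, and the negation of an & is the ⊕ of the negations.

```latex
\begin{align*}
% Step 1: the sequent stated in section 7.3
(A \multimap \delta) \mathbin{\&} (B \multimap \delta)
  &\vdash (A \oplus B) \multimap \delta \\
% Step 2: contraposition
((A \oplus B) \multimap \delta)^{\perp}
  &\vdash \bigl((A \multimap \delta) \mathbin{\&} (B \multimap \delta)\bigr)^{\perp} \\
% Step 3: De Morgan duality for the additives
\bigl((A \multimap \delta) \mathbin{\&} (B \multimap \delta)\bigr)^{\perp}
  &\equiv (A \multimap \delta)^{\perp} \oplus (B \multimap \delta)^{\perp}
\end{align*}
```

Chaining steps 2 and 3 gives exactly (6c). The result is a ⊕, so the hearer is guaranteed one prohibition but does not get to choose which.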
So far, so good. Next, consider a situation in which the speaker is not
ignorant. Exactly one of the alternatives is prohibited, and this time the
speaker knows which one it is. Let’s say that apple-eating is forbidden, but
pear eating is fine. If the speaker were being fully cooperative, then she
would normally choose to simply say (5b), and certainly would not choose to
say (5a). In Gricean terms, adding a superfluous disjunct would violate either
the maxim of Quantity, or the maxim of Manner, or both.
There are nevertheless situations in which this kind of uncooperative
statement might be used. For instance, if a father tells an older sister the
rules (“apples forbidden, pears ok”), she might later uncooperatively tell her
younger brother
(7) You may not eat this apple or this pear . . . but I won’t tell you which.
Once again, the rational course of action on the part of the younger sibling
will be to refrain from eating either piece of fruit. Presumably this is exactly
the outcome the unkind sister is aiming for. (I’m indebted to Sven Lauer for
this scenario; see also Simons 2005:273n.4.)
In both the ignorance scenario and the uncooperative scenario, at least
one of the disjuncts holds, but the choice of which fruit is prohibited belongs
to the master, not the slave. The subject of the prohibition must plan for the
worst, and therefore can’t safely commit to either alternative.
Finally, imagine that the speaker is neither ignorant nor uncooperative.
She may be an expert (perhaps she just received full instructions from the
parents) or she may be herself the source from which permission flows; in
any case, she is fully opinionated about what is forbidden. Crucially, although
(6) guarantees only one disjunct, it is consistent with situations in which
both disjuncts hold. As just argued, if exactly one disjunct held, the speaker
would simply have said so. We can deduce, therefore, that both disjuncts
must hold.
There is one more step to complete the Gricean explanation. If the speaker
intends to convey double prohibition, why not use and?
(8) You may not eat an apple and a pear.
Although this sentence may have the desired double-prohibition reading,
it certainly also has a reading on which it prohibits (only) complex events
that involve eating both an apple and a pear. Uttering (8), then, leaves in
play the possibility that eating a single piece of fruit may be permitted. The
speaker uses a weak form in (6) to express a stronger meaning in order to
avoid misinterpretation.
Thus the assumption that the speaker is opinionated and cooperative de-
rives the implicature that both disjuncts are prohibited via ordinary Gricean
reasoning, without the need to stipulate any special uniformity or distributiv-
ity axioms (as in Alonso-Ovalle 2006) or Zimmermann’s (2000:286) Authority
Principle.
6 Comparisons with other accounts
6.1 Implicature accounts
A number of authors, including Schulz (2005) and Fox (2007), suggest that
free choice implications are implicatures that arise in contexts in which the
speaker is opinionated about which options are permitted and which are not.
Fox (2007) reasons as follows: if a speaker utters a disjunction when she
could have made a stronger statement, this could naturally lead to a Quantity
implicature that she did not have sufficient evidence to assert the stronger
statement. If those ignorance implicatures are implausible, as when the
speaker is describing permissions in a situation in which their judgment is
authoritative, the implausibility can trigger a repair strategy under which the
disjunction is pragmatically enriched by the application of a predicate exh
(for “exhaustive”). For instance, if an authoritative speaker says You may eat
an apple or a pear, it may be implausible that she doesn’t know whether you
may eat an apple, or whether you may eat a pear. Therefore the statement
♦(A ∨ P ) can be strengthened (given a number of additional assumptions) to
an exhaustive meaning equivalent to the proposition ♦A ∧ ♦P ∧ ¬(♦(A ∧ P )).
This asserts that you may have an apple, and you may have a pear, but you
may not both have an apple and a pear.
I will discuss three potential problems with these accounts. The first
problem is that the free-choice reading can survive even in the presence of
manifest ignorance on the part of the speaker:
(9) I don’t know whether you may have an apple or a pear.
Since exhaustivity is supposed to be triggered by contexts that are incompatible
with ignorance, (9) should only have a reading on which it means ‘I don’t
know whether you may have an apple or whether you may have a pear’. But
(9) robustly also has a free-choice reading on which it means ‘I don’t know
whether you may eat a piece of fruit, where the fruit is your choice between
an apple or a pear’.
(10) If it turns out that John may have an apple or a pear, he’ll choose the
pear.
Likewise, as Kamp (1978:279) notes, free choice interpretations remain available
for the antecedent of a conditional, where it is far from clear how
assumptions about complete knowledge of the alternatives could enter in.
The second problem is that if free choice implications were implicatures,
we should expect them to be generally cancelable:
(11) You may eat an apple or a pear, although in fact you may not eat an
apple.
Probably (11) has a non-free choice reading on which it is at least logically
consistent. If this were the basic semantic meaning of (11), then we would
expect it to emerge whenever the free-choice implication is cancelled. The
puzzling thing is that if we assume the speaker is opinionated, (11) gives a
strong impression of contradiction rather than of a cancelled implicature.
Chemla (2009a, 2009b) proposes a pragmatic principle that he calls
symmetry, which says that the epistemic attitude of the speaker must be
uniform across disjuncts. Symmetry correctly predicts that (11) should
be infelicitous, since it implies that the speaker holds a different attitude
towards one disjunct than towards the other. However, symmetry alone
cannot explain why (11) sounds contradictory.
One possibility is that performativity is interfering. Portner (2009b)
suggests that performative uses (see section 7.2 below) force, or at least
strongly promote, a free choice interpretation. If so, then what (11) shows
is that at least when an utterance is performative, free choice implications
cannot be cancelled.
The third problem applies to Fox’s account, though not to Schulz’s: as
Fox himself notes, the proposed implicatures for the free-choice reading do
not match intuitions about the meanings of the sentences in question. Fox’s
exh-enhanced truth conditions assert that eating an apple is permitted, and
eating a pear is permitted, but eating an apple and a pear is forbidden. But
as Simons (2005) and others observe, free choice is compatible with joint
permission. For instance,
(12) [You may eat as much fruit as you want, so]
You may (certainly) eat an apple or a pear.
On Fox’s account, (12) should be contradictory on a free-choice reading of
the final clause. However, although (12) may be mildly redundant, there is no
hint of contradiction.
Franke (2009:8) and van Rooij (2010:18) derive results similar to Fox’s
by using a particular game-theoretic technique (“Iterated Best Response”) to
compute implicatures. One advantage of their approach is that the proposi-
tion that eating both an apple and a pear is forbidden arises as an implicature
only when certain alternatives are salient, correctly predicting that (12) need
not be a contradiction.
On the account here, of course, the explanation for the fact that (12) is
not a contradiction is particularly simple and direct: You may eat an apple or
a pear entails that you may eat an apple, and that you may eat a pear, but
refrains from saying anything about whether it’s ok to eat both an apple and
a pear. It neither grants permission to eat two pieces of fruit, nor forbids it.
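The difference between the two predictions can be laid out side by side. This is a sketch only: it writes P for the pear proposition, uses ⇝ for Fox's pragmatic strengthening, and assumes for illustration that jointly eating both fruits is rendered as A ⊗ P, which the text does not state explicitly. The unprovability claims are for atomic A, P and δ.

```latex
\begin{align*}
\text{Fox (exh):}\quad
  & \Diamond(A \lor P) \;\leadsto\; \Diamond A \land \Diamond P \land \neg\Diamond(A \land P) \\[4pt]
\text{Here:}\quad
  & (A \oplus P) \multimap \delta \vdash (A \multimap \delta) \mathbin{\&} (P \multimap \delta) \\
  & (A \oplus P) \multimap \delta \nvdash (A \otimes P) \multimap \delta
      && \text{joint permission is not granted} \\
  & (A \oplus P) \multimap \delta \nvdash \bigl((A \otimes P) \multimap \delta\bigr)^{\perp}
      && \text{joint permission is not forbidden}
\end{align*}
```

Since neither of the last two sequents is derivable, adding a separate grant of joint permission, as in the bracketed context of (12), creates no inconsistency.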
Van Rooij frames the comparison between exhaustivity and game theory
as part of the debate about embedded implicatures: if free choice implications
can be handled using iterated best response, then free choice no longer
provides an argument that implicatures must be calculated locally (i.e., in
embedded contexts). The resource-sensitive approach here weakens the
argument that free choice motivates embedded implicatures even further, by
calling into question whether free choice implications are implicatures in the
first place.
6.2 Alternative set semantics
Zimmermann (2000) proposes that disjunction contributes a set of exhaustive
epistemic alternatives, so that You may eat an apple or you may eat a pear
expresses the claim that it is possible that you may eat an apple and it is
possible that you may eat a pear. Novel pragmatic principles (notably his
Authority Principle) strengthen this conjunction into an assertion that you
may eat an apple and you may eat a pear.
Geurts (2005) elaborates on Zimmermann’s analysis, arguing that disjunctive
alternatives should not always be epistemic. Rather, disjunction “fuses”
with nearby modal operators, so that You may eat an apple or a pear means
that you may eat an apple and you may eat a pear without needing to invoke
any special pragmatic principle.
Neither Zimmermann’s nor Geurts’ analysis explains why the free-choice
or differs from an overt and (i.e., You may eat an apple and you may eat
a pear) in failing to guarantee that two pieces of fruit may be eaten. In
addition, as Geurts (2005:406) briefly discusses, it is not clear how either
analysis accounts for negated free choice (discussed above in section 5).
Zimmermann’s idea that disjunction introduces a set of alternatives has
been implemented in a variety of ways. I will mention three here.
Kratzer and Shimoyama (2002) propose that indefinites contribute a set
of alternatives, one for each way of resolving the indefinite. This requires in
turn a modification of the basic compositional semantics, since it is necessary
to allow for composition with sets of meanings instead of single meanings.
This is done pointwise using “Hamblin semantics”, so that an embedded
indefinite can give rise to a set of alternatives at higher compositional levels
(see Shan 2004 for discussion of the complexities of pointwise composition).
Alonso-Ovalle (2006) extends this strategy from indefinites to disjunction,
explicitly addressing the free choice problem.
Aloni’s (2007) approach manages disjunction-alternatives within a dy-
namic semantics based on Dekker 2002, supplemented with structured propo-
sitions.
Van Rooij (2008:309) sketches yet a third implementation, on which
alternatives are built into the definition of a minimal extension of a world.
Then a world in which you eat only an apple might qualify as a minimal
extension of the world we are in, but not a world in which you eat both an
apple and a pear. In order to deliver free choice implications, it is necessary
for the propositions expressed by a disjunction to always be among those
used for articulating minimal extensions, though this requirement is not
guaranteed by the formal analysis.
In these approaches, free choice effects arise when certain operators
explicitly manipulate alternative sets. For instance, Aloni stipulates that
may(Φ) is true (where Φ is a set of alternatives) just in case the ordinary
meaning of may is true of each alternative. Thus You may eat an apple or a
pear involves applying may to the set of alternatives corresponding to the
addressee eating an apple and the addressee eating a pear. The sentence will
be true, then, just in case You may eat an apple is true and You may eat a
pear is true.
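The stipulation just described can be written schematically as follows. This is a paraphrase of the prose above rather than Aloni's own formulation; the name may₀ for the ordinary, non-alternative-taking meaning of may is an assumption introduced here for illustration.

```latex
% may_0: hypothetical name for the ordinary meaning of "may"
\[
\mathrm{may}(\Phi) \text{ is true}
  \iff
\text{for every } \alpha \in \Phi,\ \mathrm{may}_0(\alpha) \text{ is true}
\]
% Instance for "You may eat an apple or a pear", with alternatives a and p:
\[
\Phi = \{a, p\}
  \;\Longrightarrow\;
\bigl(\mathrm{may}(\Phi) \iff \mathrm{may}_0(a) \land \mathrm{may}_0(p)\bigr)
\]
```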
The account here resembles Aloni’s alternatives account in two important
respects. First, free choice implications are entailments rather than implica-
tures. As we saw in section 6.1, the fact that free choice implications do not
always seem to be cancelable argues in favor of theories on which they are
treated as entailments.
Second, because alternative-taking may requires that ordinary may must
be true of every alternative, it is a downward-entailing operator with re-
spect to the disjunction that gives rise to the alternatives. Aloni points out
that this explains why (so-called free choice) any is licensed (e.g., You may
eat anything), and since the antecedent of linear implication is likewise a
downward-entailing position (as noted above), the same explanation carries
over here. (Of course, there is more to free choice than placing an indefinite
in a downward entailing context. For instance, a referee observes that in
some Romance languages, some free-choice indefinites are licensed under
permission, but not in the antecedent of conditionals or in other downward
entailing contexts.)
One important difference between the approach here and alternative-based
analyses, including Aloni’s, is the integration with the larger compositional
system. The alternative-set approach in effect creates unbounded
dependencies in the semantics: or introduces alternatives which the compo-
sitional system must track until an alternative-aware operator collapses the
alternatives back into a single proposition. The account here adjusts only
the denotations of the logical connectives, leaving the compositional system
entirely undisturbed. (Not that I have provided a compositional analysis,
though I trust that appropriate details can easily be supplied.)
7 Issues
7.1 Free choice effects apart from permission
It is widely assumed that whatever explains free choice implications for
deontic modals should be the same thing that explains the similar behavior
of epistemic modals:
(13) a. John might be in Aarhus or in Boston.
b. John might be in Aarhus.
c. John might be in Boston.
In parallel with the permission cases, the disjunction in (13a) entails (13b) and
(13c).
The simplest way to extend the account here to epistemic cases would
be to add to our logic a new atomic formula ε, which is true just in case
everything that is epistemically known holds. Then You might be in Aarhus
would translate as A ⊸ ε, and the desired entailments follow as a matter of
logic.
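Spelled out, the epistemic entailment has the same shape as the deontic theorem of section 4, with the new atom in place of δ. The sketch writes ε for that atom (the "epsilon" referred to just below) and lets A and B stand for the two locations in (13).

```latex
% Epistemic analogue of the free choice theorem:
\[
(A \oplus B) \multimap \varepsilon
  \;\vdash\;
(A \multimap \varepsilon) \mathbin{\&} (B \multimap \varepsilon)
\]
% The proof is the section 4 derivation with \delta replaced by \varepsilon;
% nothing in that derivation depends on the deontic gloss of the atom.
```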
Adding an epsilon to the logic is more than a superficial change. It is im-
portant to keep track of what the logic claims to be modeling. Classical logic
promises to preserve truth: if the assumptions are true, the conclusion will
be true. Since truth is not resource sensitive (if something is true once, it is
true again and again), that is why it is legitimate to duplicate and discard as-
sumptions. Linear Logic promises to preserve resources: whatever resources
the assumptions provide, that is exactly what resources will appear in the
conclusion. In our deontic application, the critical resource is permission:
if the assumptions provide enough permission to eat exactly one piece of
fruit, then the conclusion will provide the same amount of permission. In
the epistemic case, the critical resource is epistemic commitment: whatever
commitments are made by the assumptions, the conclusion will make exactly
the same commitments.
There are other important differences between deontic logic and epistemic
logic. For instance, it is generally considered desirable for an epistemic logic
to guarantee that if you know that A is true, then A is true (□A ⊢ A). But
deontically, you would not want to conclude from the fact that A is obligatory
that A must hold, since obligations are all too often not fulfilled. More
relevantly, there are empirical dis-analogies between the free choice behavior
of deontic uses of modals versus epistemic modals. For instance, Kamp
(1978), Zimmermann (2000), and Aloni (2007) note that it is significantly more
difficult to construe epistemic modals as having a . . . but I don’t know which
interpretation (though it is still possible — see especially Simons 2005:274).
I’m not aware of any reason why a reduction strategy could not be part
of a more complete analysis of epistemic modality; nevertheless, it would
be prudent to be cautious about assuming that any deontic analysis should
automatically extend to epistemic cases.
In addition to the possibility that free choice effects may occur in other
modalities, Fox (2007) argues that free choice effects can be discerned in
non-modal contexts that involve existential quantifiers.
(14) There’s beer in the fridge or in the cooler out back.
Especially when (14) is heard as an implicit permissive, (14) entails both that
there is beer in the fridge and that there is beer in the cooler out back. Both
alternatives are guaranteed to be true, and the consumer of the information
has free choice of which one is relevant for forming a plan of action.
Klinedinst (2007) suggests that free choice effects are present with some
existential quantifiers, but only when the quantificational DP is plural:
(15) a. Some passengers got sick or had difficulty breathing.
b. A passenger got sick or had difficulty breathing.
In (15a), there is a reading on which some passengers got sick, and some had
difficulty breathing. On such a reading, at least some of the passengers must
have gotten sick, and at least some of the passengers must have had difficulty
breathing. But in (15b), there is no guarantee that both of the properties must
be instantiated.
Having mentioned these facts, I will not attempt a discussion here of the
interaction of free choice with quantifiers or with plurals. See Chemla 2009a
for experimental evidence and relevant discussion.
7.2 Performativity
Kamp (1978) draws a distinction between granting permission versus describing
permission, where granting permission is a performative action. When
a parent says You may eat an apple or a pear in the right circumstances,
fruit-eating options may come into being that were not present before the
utterance. But when a sibling comments later Apparently, you may eat an
apple or a pear, they are merely describing the current situation, and no
new options come into being. Van Rooij (2008) and Portner (2009b) de-
velop a dynamic semantics for permission on which a permission sentence
performatively changes the set of what is allowed.
One of the main arguments that performativity is important relies on
correlations between performative uses and the availability of free choice
interpretations. Certainly descriptive uses (such as the sibling’s comment) can
have a free choice interpretation or not. Performatives, however, strongly pre-
fer a free choice interpretation. Yet it may still be possible for a performative
to have a non-free choice interpretation:
(16) You may pillage city X or city Y. But first take counsel with my secre-
tary.
Kamp (1973:67; see also Kamp 1978:279) says of this example that “[t]he
second part of this statement makes it clear that the vassal should not infer
from the first part that he may make his own choice of city. Which one he may
loot ultimately depends on the secretary’s advice, the tenor of which — we
may assume — is at this point unknown to king and vassal alike.” To be
sure, nothing specific has been permitted, and the vassal cannot form a
complete plan of action. If we conceive of a performative as something that
enlarges what an agent may safely do, we might therefore suppose that (16) is
a merely descriptive use, since it does not by itself allow the vassal to act. Yet
something must have been permitted: where does the disjunctive permission
that the sentence describes come from, if not from the performance of (16)?
As far as the current paper is concerned, it is enough for permission
sentences to characterize what is allowed. Then whether an utterance ex-
pands the sphere of permissibility depends on the interaction of the truth
conditions with the normal range of factors that influence how a discourse
participant decides to react to an utterance. Whether this minimalist strategy
is viable, or whether it will ultimately be necessary to provide a special role
for performativity remains to be seen. (See Kamp 1978 for extensive, but
ultimately inconclusive, discussion.)
7.3 Is there a conjunctive use of or after all?
Geurts (2005) and Simons (2005) emphasize the importance of explaining
how free choice implications arise when or takes scope over the permission
modal.
(17) a. You may eat an apple or a pear.
b. You may eat an apple or you may eat a pear.
The account of free choice given so far does not explain why (17b) also has a
free choice interpretation.
Simons proposes an across-the-board LF movement operation on which
the sentence with unembedded or is predicted to be logically equivalent to
You may [eat an apple or eat a pear]. That approach is compatible with the
account of free choice here.
However, there is an alternative explanation that may be worth some
consideration: perhaps resource-sensitive or is ambiguous between ⊕ (the
translation we’ve given it so far) and &.
After all, there is no other lexical item that is a candidate for expressing &.
For instance, as mentioned above, if you have ingredients for either meringue
or angel food cake, but only enough to make one recipe, and someone asks
‘What’s for dessert?’, the answer is meringue or& cake, never meringue and&
cake.
A second intriguing clue comes from conditionals. In Linear Logic,
strengthening of the antecedent is valid for & but not for ⊗. That is, we
have A ⊸ C ⊢ (A & B) ⊸ C but A ⊸ C ⊬ (A ⊗ B) ⊸ C. The observation
that and never expresses & explains why trying to strengthen an antecedent
using and in English does not work: If John left, we could all play bridge does
not entail If John left and Mary left, we could all play bridge. But if or has a
conjunctive use, then we could explain why the inference does seem valid if
we use or: If John left or& Mary left, we could all play bridge.
If or can express &, then the ability of (17b) to serve as a paraphrase of
(17a) is immediately explained: it translates directly as (A ⊸ δ) & (B ⊸ δ),
and it is easy to prove that (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ.
Of course, if or had such a conjunctive use, we would expect it to occur
in embedded position too, for example, You may eat an apple or& a pear. But
this is harmless, and merely gives a different route to the . . . but I don’t know
which reading, which we derived above by giving (disjunctive) or wide scope.
More problematically, we would also expect a conjunctive or to be avail-
able in non-modal sentences. Then saying that John left or& Mary left would
offer the addressee free choice of which disjunct to believe, yet would license
belief in at most one of the disjuncts. Such a meaning does not appear to be
available.
Put another way, non-modal uses of or appear to always be classical
disjunction (this is hardly surprising). One notable feature of Linear Logic
is that the classical connectives are easily expressible, given the addition of
the so-called exponential operators, ! (pronounced ‘of course’) and ? (‘why
not?’): from ∆ ⊢ !A infer ∆ ⊢ A, !A); from ?A ⊢ ∆ infer ⊢ ∆. These operators
allow a richer control over resources in which assumptions can be used
repeatedly, as in contraction, or ignored, as in weakening. Given Linear
Logic with exponentials, we can choose a more relaxed classical resource
management regime, or a more fussy pure Linear Logic regime, as needed.
For instance, the classical disjunction of A and B can be expressed as !A ⊕ !B.
So there is no problem allowing Linear reasoning to peacefully coexist
with classical reasoning, as long as we can reliably tell which kind of resource
management to use in any given context. To a first approximation in English,
linear resource management appears to be relevant only for untensed clauses
with bare verb forms, as in You may eat an apple or eat a pear, in which or
takes scope over the untensed bare verb phrases eat an apple and eat a pear.
Then we could suppose that the reason John left or Mary left does not have
a conjunctive interpretation is that the tensed clauses trigger (only) a
classical interpretation of or.
Figuring out how to regulate the distribution of an ambiguous or would
be a major undertaking, so I leave this issue unresolved for now.
8 Semantics for linear logic
The discussion so far has been conducted entirely in terms of inference rules
and proofs. It is unusual these days, though not unheard of, to express
the meaning of natural language using proof theory without giving a model
theory. More often, of course, we have the opposite situation, in which
semantic analyses provide models without any proof theory.
The most complete picture, however, emerges when proof theory and
model theory complement each other. Therefore I will discuss models for
Linear Logic here, with a detailed illustration of a free choice example.
There are a number of semantic approaches to Linear Logic. Girard’s
(1987, 1995) original semantics in terms of coherence spaces and in terms of
phase spaces would not be directly helpful here. There are other semantic
approaches, however, that have tantalizing associations with the granting
and denying of permission. I will mention three. First, Petri nets describe
the movement of tokens through a network. Lokhorst (1997) uses Petri nets
as models of his Linear Logic treatment of deontic reasoning. (Think of the
tokens as lumps of permission moving from one location to another.) Second,
in game semantics a Proponent and an Opponent take turns making choices,
and I have argued that tracking choice is central to understanding permission
talk. See, e.g., Accorsi and van Benthem 1999 for a discussion of game
semantics for Linear Logic. Third, there are computational models of Linear
Logic that make an explicit connection between the additives and choice. For
example, Abramsky’s (1993) computational semantics for intuitionistic Linear
Logic interprets A ⊗ B as an ordered pair ⟨A, B⟩ both of whose elements
will be used in further computation (eager evaluation); A & B, on the other
hand, denotes an ordered pair only one of whose elements will ever be used
(lazy evaluation), and of course A ⊕ B delivers a projection function that
chooses one or the other of the elements in a & pair. Unfortunately for our
purposes here, Abramsky’s computational interpretation of classical Linear
Logic involves parallel distributed processing, which would take us too far
afield.2
Most reassuringly familiar for linguists, Allwein and Dunn (1993) provide
a kosher Kripke-style possible worlds semantics, and that is the approach
that I will present here.
Following Allwein and Dunn, the expository strategy will be to begin with
an algebraic model that is faithful to the inference rules, then show how to
reconstruct that algebra in terms of worlds.
8.1 An algebraic semantics
The algebraic model contains three main components: a lattice for modeling
the additive connectives, a unary operation for modeling negation, and a
binary operation for modeling the multiplicative connectives.
Additives: let A, ∧, and ∨ form a bounded lattice with partial order ≤ and
top and bottom elements. The lattice can be finite or non-finite, and it can be
distributive or non-distributive.
Negation: now let ∼ be a DeMorgan negation on that lattice. This means
that ∼ must be order-reversing (for all x, y in A, x ≤ ∼y iff y ≤ ∼x), and it
must be involutive (for all x in A, ∼∼x ≤ x).
Multiplicatives: we add a commutative, associative binary operation ◦
with identity element t (that is, t ◦ a = a = a ◦ t for all a in A). Thus A,◦, and
t form a commutative monoid. Note that t may be distinct from the top of
the lattice. The monoid operation must distribute over the join operation,
that is, for all a, b, c ∈ A : a ◦ (b ∨ c) = (a ◦ b) ∨ (a ◦ c). It must also be
compatible with negation in the sense that for all a, b, c ∈ A : a ◦ b ≤ c iff
a ◦ ∼c ≤ ∼b (“antilogism”).
2 Though it is intriguing to think that the meaning of some natural language expressions might
be appropriately modeled by a distributed process. Perhaps some permission sentences
denote programs which the recipient can execute in various environments in order to
produce whichever certificate of permission is required. Then a free choice permission
sentence denotes a program whose execution is blocked until it receives an external choice
(a selection of which alternative to deploy).
The points in the lattice model formulas. Given a valuation v mapping
atomic formulas onto elements of A, we extend v to complex formulas as
follows: v(A⊥) = ∼v(A); v(A & B) = v(A) ∧ v(B); v(A ⊕ B) = v(A) ∨ v(B);
v(A ⊗ B) = v(A) ◦ v(B); v(A ⅋ B) = ∼(∼v(A) ◦ ∼v(B)); and v(A ⊸ B) =
∼(v(A) ◦ ∼v(B)).
As an example, I will present a six-element, non-distributive lattice:
      5          x | ∼x        ◦ | 0 1 2 3 4 5
     / \         --+---        --+------------
    3   4        0 |  5        0 | 0 0 0 0 0 0
    |   |        1 |  3        1 | 0 1 2 1 2 5
    1   2        2 |  4        2 | 0 2 1 2 1 5
     \ /         3 |  1        3 | 0 1 2 3 4 5
      0          4 |  2        4 | 0 2 1 4 3 5
                 5 |  0        5 | 0 5 5 5 5 5
The Hasse diagram on the left gives the lattice order in the usual way, so that
0 ≤ 1, 1 ≤ 3, and so on. In addition, since ≤ is reflexive and transitive, we
also have 0 ≤ 0, 0 ≤ 3, etc.
Since meet (∧) in a lattice is the unique greatest lower bound, it can be
read off the Hasse diagram, e.g., 5 ∧ 5 = 5, 4 ∧ 5 = 4, 4 ∧ 3 = 0, and so on
(dually for the join operation ∨).
It is easy to see by inspection that the negation relation ∼ is involutive
(e.g., ∼∼3 = 3) and order reversing (e.g., along with 0 ≤ ∼3 we have 3 ≤ ∼0).
Note that 3 serves as the identity element t of the monoid. Since the
monoid operation is commutative, the matrix is symmetric across the top-left
to bottom-right diagonal (e.g., 4◦2 = 2◦4). Furthermore, mechanical checking
will confirm that the monoid operation is associative (e.g., (4 ◦ 2) ◦ 1 = 4 ◦ (2 ◦
1)), that it distributes over the join operation (e.g., 3◦(1∨4) = (3◦1)∨(3◦4)),
and that it respects the antilogism requirement (e.g., 4 ◦ 2 ≤ 3 ≡ 4 ◦ ∼3 ≤ ∼2).
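The "mechanical checking" mentioned above can be sketched in a few lines of Python. This is an illustration rather than anything from the original paper: the names `NEG`, `MUL`, `UP`, `leq`, `join`, and `check_algebra` are hypothetical, and the tables are simply transcribed from the six-element model.

```python
from itertools import product

E = range(6)                      # the six lattice points 0..5
NEG = [5, 3, 4, 1, 2, 0]          # NEG[x] = ∼x
MUL = [                           # MUL[x][y] = x ◦ y
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
# UP[x] = every y with x ≤ y (two chains: 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5)
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}

def leq(x, y):
    """Lattice order x ≤ y."""
    return y in UP[x]

def join(x, y):
    """Least upper bound: the upper bound that lies below all upper bounds."""
    ubs = [z for z in E if leq(x, z) and leq(y, z)]
    return next(z for z in ubs if all(leq(z, w) for w in ubs))

def check_algebra():
    """Map each requirement of the algebraic model to True or False."""
    pairs = list(product(E, E))
    triples = list(product(E, E, E))
    return {
        "involutive": all(NEG[NEG[x]] == x for x in E),
        "order_reversing": all(leq(x, NEG[y]) == leq(y, NEG[x]) for x, y in pairs),
        "identity_is_3": all(MUL[3][a] == a == MUL[a][3] for a in E),
        "commutative": all(MUL[a][b] == MUL[b][a] for a, b in pairs),
        "associative": all(MUL[MUL[a][b]][c] == MUL[a][MUL[b][c]]
                           for a, b, c in triples),
        "distributes_over_join": all(
            MUL[a][join(b, c)] == join(MUL[a][b], MUL[a][c]) for a, b, c in triples),
        "antilogism": all(
            leq(MUL[a][b], c) == leq(MUL[a][NEG[c]], NEG[b]) for a, b, c in triples),
    }
```

Calling `check_algebra()` returns one Boolean per requirement, so a failing entry would point directly at the property that a mistyped table cell violates.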
A sequent Γ semantically entails ∆ (written ‘Γ ⊨ ∆’) just in case the
valuation of the multiplicative conjunction of the formulas in Γ is dominated
by the valuation of the multiplicative disjunction of the formulas in ∆. For
instance, since x ∧ y ≤ x for all x, y in A by the definition of meet in a
lattice, we have that A & B ⊨ A.
To illustrate how these tables provide a model of the logic, recall that we
have the following three theorems discussed in previous sections and one
non-theorem:
(18) a. (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ
     b. (A ⊕ B) ⊸ δ ⊢ (A ⊸ δ) & (B ⊸ δ)
     c. (A ⊸ δ) ⊕ (B ⊸ δ) ⊢ (A & B) ⊸ δ
     d. (A & B) ⊸ δ ⊬ (A ⊸ δ) ⊕ (B ⊸ δ)
If the given algebra is a faithful model of Linear Logic, we expect that for
every valuation v assigning a lattice element to the propositional symbols
δ, A, and B, the valuation of the left hand side of any theorem will be
dominated (in the sense of the lattice order ≤) by the valuation of the right
hand side. This is the case for (18) (a) through (c), but we have a countermodel
for (18d): if v(δ) = 0, v(A) = 1, and v(B) = 2, then v((A & B) ⊸ δ) =
v(((A & B) ⊗ δ⊥)⊥) = ∼((v(A) ∧ v(B)) ◦ ∼v(δ)) = ∼((1 ∧ 2) ◦ ∼0) = 5. But
v((A ⊸ δ) ⊕ (B ⊸ δ)) = 0, and 5 ≰ 0.
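The claim about (18) can be tested by brute force over all 216 valuations of A, B, and δ. The following is a minimal sketch under the assumption that the tables are transcribed correctly; `limp`, `SEQUENTS`, and `countermodels` are hypothetical helper names, not notation from the paper.

```python
from itertools import product

E = range(6)
NEG = [5, 3, 4, 1, 2, 0]          # NEG[x] = ∼x
MUL = [                           # MUL[x][y] = x ◦ y
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}

def leq(x, y):
    return y in UP[x]

def meet(x, y):
    """Greatest lower bound (interprets &)."""
    lbs = [z for z in E if leq(z, x) and leq(z, y)]
    return next(z for z in lbs if all(leq(w, z) for w in lbs))

def join(x, y):
    """Least upper bound (interprets ⊕)."""
    ubs = [z for z in E if leq(x, z) and leq(y, z)]
    return next(z for z in ubs if all(leq(z, w) for w in ubs))

def limp(a, d):
    """Linear implication: v(A ⊸ δ) = ∼(v(A) ◦ ∼v(δ))."""
    return NEG[MUL[a][NEG[d]]]

# Each entry gives (left-hand value, right-hand value) for v(A)=a, v(B)=b, v(δ)=d.
SEQUENTS = {
    "18a": lambda a, b, d: (meet(limp(a, d), limp(b, d)), limp(join(a, b), d)),
    "18b": lambda a, b, d: (limp(join(a, b), d), meet(limp(a, d), limp(b, d))),
    "18c": lambda a, b, d: (join(limp(a, d), limp(b, d)), limp(meet(a, b), d)),
    "18d": lambda a, b, d: (limp(meet(a, b), d), join(limp(a, d), limp(b, d))),
}

def countermodels(name):
    """Valuations (a, b, d) at which the left side is not dominated by the right."""
    return [(a, b, d) for a, b, d in product(E, E, E)
            if not leq(*SEQUENTS[name](a, b, d))]
```

On this encoding `countermodels` comes back empty for (18a) through (18c), while the list for (18d) includes the valuation used in the text, v(A) = 1, v(B) = 2, v(δ) = 0.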
There are (infinitely) many other possible choices for a lattice, and for
any given lattice, there may be many choices for a suitable negation and for
a suitable monoid operation. For instance, Restall (2000:170) gives an even
simpler (but still instructive) model of (distributive) Linear Logic based on a
four-element lattice. Since Linear Logic is sound and complete with respect
to the class of algebraic models given here, a sequent is a theorem iff its left
hand side semantically entails its right hand side for every valuation in every
model.
8.2 A possible-worlds semantics
The algebraic semantics is simple and straightforward, in part because it
merely recapitulates the inference rules; for the same reason, it may not
add any insight beyond what is already evident from the inference rules
themselves. Constructing a Kripke-style possible worlds semantics is a bit
more complicated, but may allow natural language semanticists to transfer
some of their intuitions from more familiar sorts of semantics for natural
languages. We shall see that one particularly intriguing feature of the Kripke
semantics for Linear Logic is that there will be three possibilities for the
status of a formula at a world: it may be true, false, or neither true nor false,
which is exactly what makes Linear Logic suitable for modeling actions that
may be permitted, forbidden, or neither permitted nor forbidden.
Allwein and Dunn associate each element in A with a particular set of
worlds. The construction goes as follows. Consider pairs of the form ⟨F, I⟩,
where F and I are sets of points in the lattice. We require ⟨F, I⟩ to satisfy the
following four requirements: first,
(w1): F and I must be disjoint.
Second,
(w2): F must be closed upward under ≤, so that for all a ∈ F and for all
b ∈ A : (a ≤ b) implies b ∈ F . Dually, I must be closed downward under ≤,
so that for all a ∈ A and for all b ∈ I : (a ≤ b) implies a ∈ I. In particular, F
always contains the top element, and I always contains the bottom element
of the lattice.
Third, F and I must be closed under meets and joins, respectively. That
is:
(w3): for all a, b ∈ F : a ∧ b ∈ F ; and for all a, b ∈ I : a ∨ b ∈ I.
In other words, conditions (w2) and (w3) require that F must be a filter,
and that I must be an ideal.
Finally, there is a maximality condition:
Maximality: A filter/ideal pair ⟨F, I⟩ satisfying (w1), (w2), and (w3) satisfies
maximality only if there is no other distinct pair of sets ⟨F′, I′⟩ also satisfying
(w1), (w2) and (w3) that properly includes the first, i.e., such that F ⊆ F′ and
I ⊆ I′.
Here are a few of the possible pairs of subsets that fail to satisfy the
requirements:
⟨{1, 2}, {1, 3}⟩ violates w1
⟨{3}, {0}⟩ violates w2
⟨{4, 3, 5}, {0}⟩ violates w3
⟨{4, 5}, {0}⟩ violates Maximality
In fact, in this model there are exactly four maximal disjoint filter/ideal pairs:
World a: ⟨{4, 5}, {0, 2}⟩
World b: ⟨{3, 5}, {0, 1}⟩
World c: ⟨{2, 4, 5}, {0, 1, 3}⟩
World d: ⟨{1, 3, 5}, {0, 2, 4}⟩
These pairs will stand in one to one correspondence with our possible worlds.
For each world w = ⟨F, I⟩, we will interpret F as the set of points that are
true at w, and I as the set of points that are false at w.
For worlds c and d, every point in the lattice is either true or false. But
for world a, points 1 and 3 are neither true nor false. Similarly, for world
b, points 2 and 4 are neither true nor false. In terms of permission talk,
there may be situations in which some things are permitted, some things are
forbidden, and some things are neither permitted nor forbidden.
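The four worlds can be recovered by enumerating every disjoint filter/ideal pair and keeping the maximal ones. The sketch below is illustrative only: the function names are hypothetical, and it assumes (as the text implies) that filters and ideals are non-empty.

```python
from itertools import combinations

E = range(6)
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}

def leq(x, y):
    return y in UP[x]

def meet(x, y):
    lbs = [z for z in E if leq(z, x) and leq(z, y)]
    return next(z for z in lbs if all(leq(w, z) for w in lbs))

def join(x, y):
    ubs = [z for z in E if leq(x, z) and leq(y, z)]
    return next(z for z in ubs if all(leq(z, w) for w in ubs))

def nonempty_subsets():
    return [frozenset(c) for n in range(1, 7) for c in combinations(E, n)]

def is_filter(F):
    """(w2) closed upward and (w3) closed under meets."""
    upward = all(b in F for a in F for b in E if leq(a, b))
    return upward and all(meet(a, b) in F for a in F for b in F)

def is_ideal(I):
    """(w2) closed downward and (w3) closed under joins."""
    downward = all(a in I for b in I for a in E if leq(a, b))
    return downward and all(join(a, b) in I for a in I for b in I)

def maximal_pairs():
    """All pairs satisfying (w1)-(w3) that no other such pair includes."""
    filters = [F for F in nonempty_subsets() if is_filter(F)]
    ideals = [I for I in nonempty_subsets() if is_ideal(I)]
    pairs = [(F, I) for F in filters for I in ideals if not (F & I)]  # (w1)
    return [(F, I) for F, I in pairs
            if not any((F, I) != (G, J) and F <= G and I <= J for G, J in pairs)]
```

Because the lattice is finite, every filter is the upward closure of a single point and every ideal the downward closure of one, so the search space is tiny; the exhaustive enumeration is used here only to avoid relying on that fact.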
The next step is to associate each point in the lattice with a set of worlds.
If w is a world associated with the pair of sets of points ⟨F, I⟩, let w₁ indicate
F and w₂ indicate I. Then we can define a map β that takes each point p in
the lattice onto the set of worlds w such that p ∈ w₁:
β(0) = {}
β(1) = {d}
β(2) = {c}
β(3) = {b, d}
β(4) = {a, c}
β(5) = {a, b, c, d}
In other words, we map each point in the lattice to the set of worlds that
make it true.
We now need to define relations over sets of worlds that will allow us to
reconstruct the logical operations we want to model: ∧, ∨, ∼, and ◦.
The meet operation is straightforward. We extend β in the following way:
β(p ∧ q) = β(p) ∩ β(q). So meet corresponds to simple set intersection.
Thus 4 ∧ 2 = 2, and β(4 ∧ 2) = β(4) ∩ β(2) = {a, c} ∩ {c} = {c} = β(2).
The join operation is not quite so straightforward. We cannot represent
join as set union. To see why, note that 3 ∨ 2 = 5, but β(3) ∪ β(2) =
{b, d} ∪ {c} = {b, c, d} ≠ β(5). The solution is to exploit the information
present in the second element in the pair of sets that define the worlds. To
do this, we define two operations on sets of worlds. Let W be our set of
worlds, and let C be any subset of W :
l(C) = {x | for all y ∈ W, x₁ ⊆ y₁ implies y ∉ C}
r(C) = {x | for all y ∈ W, x₂ ⊆ y₂ implies y ∉ C}
Although l and r are defined over all subsets of W , we will only need to apply
them in the following cases:
r (β(0)) = r ({}) = {a, b, c, d}
r (β(1)) = r ({d}) = {b, c}
r (β(2)) = r ({c}) = {a, d}
r (β(3)) = r ({b, d}) = {c}
r (β(4)) = r ({a, c}) = {d}
r (β(5)) = r ({a, b, c, d}) = {}
For instance, the reason a is not in r(β(1)) is that a₂ ⊆ d₂, but d ∈ β(1).
Allwein and Dunn show that for all points p in the lattice, l(r(β(p))) = β(p).
We can now define join by shifting the conjuncts using r , then taking their
intersection, then shifting back using l: β(p ∨ q) = l(r (β(p)) ∩ r (β(q))). For
instance, we have β(1 ∨ 3) = l(r(β(1)) ∩ r(β(3))) = l({b, c} ∩ {c}) = l({c}) =
β(3). Trying the problematic case given above, β(3 ∨ 2) = l(r (β(3)) ∩
r (β(2))) = l({c} ∩ {a, d}) = l({}) = {a, b, c, d} = β(5), as desired.
At this point, β, l, and r allow us to fully simulate the structure of the
lattice in terms of sets of worlds.
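As a rough check on this simulation, the maps β, l, and r can be written out over the four worlds and compared against the lattice operations. The code is a sketch, not the paper's own formalization; `W`, `beta`, `l`, `r`, and `beta_join` are names chosen here for illustration.

```python
E = range(6)
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}
# Each world is a pair (points true at it, points false at it).
W = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}

def leq(x, y):
    return y in UP[x]

def meet(x, y):
    lbs = [z for z in E if leq(z, x) and leq(z, y)]
    return next(z for z in lbs if all(leq(w, z) for w in lbs))

def join(x, y):
    ubs = [z for z in E if leq(x, z) and leq(y, z)]
    return next(z for z in ubs if all(leq(z, w) for w in ubs))

def beta(p):
    """The set of worlds at which point p is true."""
    return frozenset(w for w, (true_pts, _) in W.items() if p in true_pts)

def l(C):
    """Worlds x such that no world y whose true set includes x's is in C."""
    return frozenset(x for x in W
                     if all(y not in C for y in W if W[x][0] <= W[y][0]))

def r(C):
    """Worlds x such that no world y whose false set includes x's is in C."""
    return frozenset(x for x in W
                     if all(y not in C for y in W if W[x][1] <= W[y][1]))

def beta_join(p, q):
    """Join reconstructed on worlds: shift with r, intersect, shift back with l."""
    return l(r(beta(p)) & r(beta(q)))
```

With these definitions, `beta(meet(p, q))` coincides with plain intersection and `beta_join(p, q)` with `beta(join(p, q))` for every pair of points, whereas plain union fails on the pair 3, 2 discussed above.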
Representing negation: β(∼p) = {x|⟨∼x₂, ∼x₁⟩ ∈ r(β(p))} (where applying
∼ to a set of points returns the set resulting from applying ∼ to each member
of the original set). For instance, we have β(∼1) = {⟨∼{0, 1}, ∼{3, 5}⟩,
⟨∼{0, 1, 3}, ∼{2, 4, 5}⟩} = {b, d} = β(3).
Note that linear negation expresses something about provability, not
about falsity. One way to see this is to observe that in this model, 3 and its
negation ∼3 = 1 are both true at world d.
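The world-level definition of negation can be checked in the same style. In this sketch `star` is a hypothetical name for the map sending a world ⟨F, I⟩ to the world ⟨∼I, ∼F⟩; the rest follows the definitions in the text.

```python
E = range(6)
NEG = [5, 3, 4, 1, 2, 0]          # NEG[x] = ∼x
W = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}

def beta(p):
    return frozenset(w for w, (true_pts, _) in W.items() if p in true_pts)

def r(C):
    return frozenset(x for x in W
                     if all(y not in C for y in W if W[x][1] <= W[y][1]))

def star(x):
    """The world whose true set is ∼(false set of x) and vice versa."""
    true_pts, false_pts = W[x]
    image = ({NEG[p] for p in false_pts}, {NEG[p] for p in true_pts})
    return next(w for w, pair in W.items() if pair == image)

def beta_neg(p):
    """β(∼p) computed from worlds alone, without consulting the ∼ table for p."""
    return frozenset(x for x in W if star(x) in r(beta(p)))
```

In this model `star` fixes worlds a and b and swaps c with d, which is one way to see why 3 and ∼3 = 1 can both be true at d.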
Representing the tensor relation ◦ proceeds in two steps. In the usual
Kripke semantics, unary modal operators are characterized by an accessibility
relation, a two-place relation over worlds. Because the multiplicatives are
two-place connectives, we will need a three-place relation.3
Sxyz iff ∀p, q : (p ◦ q ∈ z₂ and q ∈ y₁) implies p ∈ x₂
The strategy here is a generalization of the Routley-Meyer semantics for
Relevant Logic. The goal is for the relation S to capture all of the information
present in the monoid operation ◦. In order to do this, it needs to take ad-
vantage of both sets of points that define the worlds: the set of propositions
that are true at a world as well as those that are false at that world.
Conceptually, S models modus ponens, in which x plays the role of
antecedent, y plays the role of the implication, and z plays the role of the
consequent. If the implication is true at y, and the consequent is false at z, S
guarantees that the antecedent must be false at x. For instance, since 3 (role:
the implication) is true at b and 1 ◦ 3 (the consequent) is false at c, but 1
(the antecedent) is not false at a, S does not hold of a, b, and c. We do have
Saba, however. The complete relation is aab, aba, baa, bbb, caa, cad,
cbb, cbc, cca, ccd, cdb, cdc, dab, dac, dba, dbd, dcb, dcc, dda, ddd.
3 Lambek grammars (e.g., Moortgat 1997) also use a three place relation to give a semantics
for a multiplicative conjunction, where the conjunction is used to model concatenation of
linguistic expressions. For an example of modus ponens in type-logical grammar, DP ⊗
DP\S ⊢ S.
Once we have constructed S as a function of ◦, we can define multiplicative
conjunction purely in terms of relations over worlds:
β(p ◦ q) = l({z|∀x, y : Sxyz and y ∈ β(q) implies x ∈ r (β(p))})
This definition unpacks S in order to reconstruct the original relation ◦.
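To make the construction concrete, here is a small Python sketch that derives S from the ◦ table and then rebuilds ◦ from S. The helper names (`S`, `S_triples`, `beta_mul`) are assumptions made for this illustration.

```python
from itertools import product

E = range(6)
MUL = [                           # MUL[p][q] = p ◦ q
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
W = {
    "a": ({4, 5}, {0, 2}),
    "b": ({3, 5}, {0, 1}),
    "c": ({2, 4, 5}, {0, 1, 3}),
    "d": ({1, 3, 5}, {0, 2, 4}),
}

def beta(p):
    return frozenset(w for w, (true_pts, _) in W.items() if p in true_pts)

def l(C):
    return frozenset(x for x in W
                     if all(y not in C for y in W if W[x][0] <= W[y][0]))

def r(C):
    return frozenset(x for x in W
                     if all(y not in C for y in W if W[x][1] <= W[y][1]))

def S(x, y, z):
    """Sxyz: whenever p ◦ q is false at z and q is true at y, p is false at x."""
    return all(p in W[x][1] for p, q in product(E, E)
               if MUL[p][q] in W[z][1] and q in W[y][0])

def S_triples():
    """The relation as a set of three-letter strings such as 'aab'."""
    return {x + y + z for x, y, z in product(W, W, W) if S(x, y, z)}

def beta_mul(p, q):
    """β(p ◦ q) rebuilt from S, β, l and r, without looking up p ◦ q."""
    keep = {z for z in W
            if all(x in r(beta(p)) for x, y in product(W, W)
                   if S(x, y, z) and y in beta(q))}
    return l(keep)
```

On this encoding `S_triples()` yields the twenty triples listed in the text, and `beta_mul(p, q)` agrees with `beta(MUL[p][q])` for all 36 pairs of points.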
8.3 Understanding linear implication
What does the multiplicative conjunction of two formulas mean? Since we
now have both an algebraic and a possible worlds semantics in correspon-
dence, we can move back and forth between the two semantics in search of
insight.
Begin with the algebra. We can keep track of the state of our reasoning
process by picking out a point in the lattice. Assume that I have good
reason to believe we are located at lattice position 1. This is a highly specific
situation: I know that we are located on world d, since that is the only world
at which 1 is true.
Now assume that I learn something: that you have eaten a pear. Call this
fact B, and associate it with lattice point 4 (i.e., let v(B) = 4). To find out
where we are now, I compute 1 ◦ 4 = 2. Since β(1 ◦ 4) = β(2) = {c}, we are
now on world c. Learning that you have eaten a pear changes our location
from world d to world c.
This may initially seem somewhat distressing. In the usual Stalnakerian
system, adding information is typically a monotonic process of eliminating
possible worlds. If we’ve already narrowed the set of live options to a single
world d, there is no way to end up on a distinct world c. Because ◦ is non-
monotonic in this sense, it may be better to think of what we have been
calling worlds as classes of worlds. Sometimes the term ‘set-up’ is used
instead of ‘world’. I will use the term ‘situation’. Then learning that you have
eaten a pear changes the current situation into a different situation, one in
which the consequences of having eaten a pear obtain.
Let’s continue to reason. We pick a point in the lattice to serve as A, the
situation in which you eat an apple, and a separate point to serve as δ, the
situation in which all obligations are fulfilled. Say that v(A) = 2, v(δ) = 3,
and v(B) is still 4. Now consider the proposition that eating an apple is
permitted: A ⊸ δ. Then v(A ⊸ δ) = v((A ⊗ δ⊥)⊥) = ∼(v(A) ◦ ∼v(δ)) =
∼(2 ◦ ∼3) = ∼(2 ◦ 1) = ∼2 = 4. Apparently, in this model, the situation
in which you eat a pear is modeled by the same situation in which you are
permitted to eat an apple. (This sort of coincidence is unavoidable in such a
tiny model, in the same way that a valuation for classical logic will be forced
to map very different formulas to the same truth value.)
So let’s say that I know we’re in a situation in which you are permitted
to eat an apple (say, point 4), and then I learn that you have eaten an apple.
Perhaps I watch you eat it. This changes things: I compute 4 ◦ 2 = 1.
Thanks to your eating an apple, we’re now in situation 1. And since 1 ≤ 3,
things are as they are supposed to be. In terms of worlds, δ is modeled by
worlds (situations) b and d; and since point 1 corresponds to (a singleton set
containing only) world d, we must be in a δ-world.
So, what if you are permitted to eat an apple or a pear? That’s ∼((2 ∨ 4) ◦
∼3) = 4. We just saw that if we start at 4 and you eat an apple, we land on a
δ-world. And indeed, if we’re at point 4 and you eat a pear instead, 4 ◦ 4 = 3,
and once again we’re in a δ-situation.
But what if you eat an apple and you eat a pear? 4 ◦ 4 ◦ 2 = 2. Situation
2 is not a δ situation, so things are not ok. Having permission to eat an
apple or a pear is not the same thing as having permission to eat an apple
and a pear. Likewise, if killing the postman is modeled by situation 4 (i.e.,
v(K) = 4), then eating an apple and killing the postman will definitely not
leave us in a δ-situation. (This small model is somewhat unrealistic, however,
in that there are situations in which eating an apple, killing the postman, and
then eating another apple is perfectly permissible.)
However, as emphasized above, having permission to eat an apple or a
pear is compatible with also having permission to eat both. Making use of
the same model, if we have v(A) = v(δ) = v(B) = 3, then v((A & B) ⊸ δ) =
v((A ⊗ B) ⊸ δ) = 3. With this valuation, eating apples and pears is truly
optional: you can eat an apple and stop, or you can eat a pear and stop, or
you can eat an apple and you can eat a pear, and in all three cases you’ll end
up in a δ-situation.
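The arithmetic in this section can be replayed step by step. The sketch below assumes the valuation used in the text (v(A) = 2 for eating an apple, v(B) = 4 for eating a pear, v(δ) = 3); `after`, `ok`, and `limp` are hypothetical helper names.

```python
from functools import reduce

E = range(6)
NEG = [5, 3, 4, 1, 2, 0]          # NEG[x] = ∼x
MUL = [                           # MUL[x][y] = x ◦ y
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 2, 5],
    [0, 2, 1, 2, 1, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 1, 4, 3, 5],
    [0, 5, 5, 5, 5, 5],
]
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}
APPLE, PEAR, DELTA = 2, 4, 3      # v(A), v(B), v(δ) as chosen in the text

def leq(x, y):
    return y in UP[x]

def join(x, y):
    ubs = [z for z in E if leq(x, z) and leq(y, z)]
    return next(z for z in ubs if all(leq(z, w) for w in ubs))

def limp(a, d):
    """v(A ⊸ δ) = ∼(v(A) ◦ ∼v(δ))."""
    return NEG[MUL[a][NEG[d]]]

def after(state, *events):
    """Situation reached by combining the events with the starting state via ◦."""
    return reduce(lambda s, e: MUL[s][e], events, state)

def ok(state):
    """A δ-situation is any point dominated by v(δ)."""
    return leq(state, DELTA)

may_apple = limp(APPLE, DELTA)                     # permission to eat an apple
may_apple_or_pear = limp(join(APPLE, PEAR), DELTA) # free choice permission
```

Starting from `may_apple_or_pear`, `after(..., APPLE)` and `after(..., PEAR)` each land in a δ-situation, while `after(..., PEAR, APPLE)` does not, mirroring the contrast between permission for either fruit and permission for both.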
9 Conclusions
On the view presented here, understanding free choice hinges on recognizing
that permission is a scarce resource, and so permission talk requires a
resource-sensitive semantics. Following Lokhorst (1997), I propose Linear
Logic as a way of tracking permission: not only what kind of permission has
been granted, but how much. Then primary free choice implications (given
You may eat an apple or a pear, infer You may eat an apple & You may eat a
pear) follow merely from expressing permission using the
(independently-motivated) Anderson/Kanger deontic reduction strategy. Double prohibition
(from You may not eat an apple or a pear infer You may not eat an apple and
You may not eat a pear) follows from standard Gricean reasoning, without
any need to postulate special pragmatic mechanisms.
The implications of this view are fairly dramatic. The claim is that natural
language expressions can differ in the resource management schemes they
impose. At the least, alethic modes impose classical resource management,
and deontic modes impose linear resource management (and quite likely,
other modes as well).
Linear Logic is one of the better known resource-sensitive logics. Other
logics may be worth considering instead. Similarly, the Anderson/Kanger
deontic reduction strategy was adopted in part for ease of exposition, and
work remains to integrate the account here within a more general framework
of modality in natural language. But apart from the advantages of Linear
Logic specifically or the deontic reduction, I would like to suggest a more
general conclusion: that we may be able to gain new and valuable insights into
long-standing puzzles in natural language semantics if we allow ourselves to
consider richer logical approaches than standard classical logic.
References
Abramsky, Samson. 1993. Computational interpretations of Linear Logic.
Theoretical Computer Science 111(1–2). 3–57. doi:10.1016/0304-3975(93)90181-R.
Accorsi, Rafael & Johan van Benthem. 1999. Lorenzen’s games and Linear
Logic. University of Amsterdam manuscript. http://www.informatik.
uni-freiburg.de/~accorsi/papers/games.pdf.
Akatsuka, Noriko. 1992. Japanese modals are conditionals. In Diane Brentari,
Gary Larson & Lynn MacLeod (eds.). The joy of grammar: A festschrift
in honor of James D. McCawley. Amsterdam: John Benjamins. 1–10.
Allwein, Gerard & J. Michael Dunn. 1993. Kripke models for Linear Logic. The
Journal of Symbolic Logic 58(2). 514–545. doi:10.2307/2275217.
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language
Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Alonso-Ovalle, Luis. 2006. Disjunction in alternative semantics. UMass
Amherst: PhD dissertation.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong
permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.
Brown, Mark. 1996. Doing as we ought: Towards a logic of simply dischargeable
obligations. In Mark Brown & José Carmo (eds.). Deontic
logic, agency and normative systems (Third International Workshop on
Deontic Logic in Computer Science). Berlin: Springer. 47–65.
Chemla, Emmanuel. 2009a. Universal implicatures and free choice effects: ex-
perimental data. Semantics and Pragmatics 2(2). 1-33. doi:10.3765/sp.2.2.
Chemla, Emmanuel. 2009b. Similarity: Towards a unified account of
scalar implicatures, free choice permission and presupposition pro-
jection. Manuscript. http://www.emmanuel.chemla.free.fr/Material/
Chemla-SIandPres.pdf.
Clancy, Patricia M. 1985. The acquisition of Japanese. In Dan Slobin (ed.).
The crosslinguistic study of language acquisition: The data (Volume 1).
Hillsdale, NJ: Lawrence Erlbaum Associates. 373–524.
Dalrymple, Mary. 2001. Lexical Functional Grammar (Syntax and Semantics
volume 34). New York: Academic Press.
Dekker, Paul. 2002. Meaning and use of indefinite expressions. Journal of
Logic, Language and Information 11(2). 141–194. doi:10.1023/A:1017575313451.
Fox, Danny. 2007. Free choice disjunction and the theory of scalar impli-
cature. In Uli Sauerland and Penka Stateva (eds.). Presupposition and
implicature in compositional semantics. New York: Palgrave Macmillan.
71–120.
Franke, Michael. 2009. Free choice from iterated best response. In Maria
Aloni, Harald Bastiaanse, Tikitu de Jager, Peter van Ormondt & Katrin
Schulz (eds.). Pre-proceedings of the seventeenth Amsterdam Collo-
quium. Amsterdam: ILLC/Department of Philosophy. 267–276.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural
Girard, Jean-Yves. 1987. Linear Logic. Theoretical Computer Science 50(1).
1–102. doi:10.1016/0304-3975(87)90045-4.
Girard, Jean-Yves. 1995. Linear Logic: its syntax and semantics. In Jean-Yves
Girard, Yves Lafont & Laurent Regnier (eds.). Advances in Linear Logic.
Lecture Note Series 222. Cambridge, UK: Cambridge University Press.
1–42.
Hansen, Jörg, Gabriella Pigozzi & Leendert van der Torre. 2007. Ten philo-
sophical problems in deontic logic. Dagstuhl Seminar Proceedings
07122. http://drops.dagstuhl.de/opus/volltexte/2007/941.
Society 74. 57–74.
Kamp, Hans. 1978. Semantics versus pragmatics. In Franz Guenthner &
Siegfried J. Schmidt (eds.). Formal semantics and pragmatics for natural
languages. Dordrecht, Holland: Reidel. 255–287.
Klinedinst, Nathan. 2007. Plurality and possibility. UCLA, CA: PhD disserta-
tion.
Kratzer, Angelika. 1991. Modality. In Arnim von Stechow, Dieter Wunderlich
(eds.). Semantik: Ein internationales Handbuch der zeitgenössischen
Forschung. Berlin: De Gruyter. 639–650.
Kratzer, Angelika, and Shimoyama, Junko 2002. Indeterminate pronouns:
The view from Japanese. In Yukio Otsu (ed.). The proceedings of the
third Tokyo conference on psycholinguistics. Tokyo: Hituzi Syobo. 1–25.
Lokhorst, Gert-Jan C. 1997. Deontic linear logic with Petri net semantics.
Technical report, FICT (Center for the Philosophy of Information and
Communication Technology). Rotterdam. http://homepages.ipact.nl/
~lokhorst/deopetri.pdf.
Lokhorst, Gert-Jan C. 2006. Andersonian deontic logic, propositional quantifi-
cation, and Mally. Notre Dame Journal of Formal Logic 47(3). 385–395.
doi:10.1305/ndjfl/1163775445.
McNamara, Paul. 2006. Deontic Logic. In Dov M. Gabbay & John Woods (eds.).
Handbook of the history of logic, volume 7: Logic and the modalities
in the twentieth century. Amsterdam: Elsevier. 197-288. A version is
also available in the Stanford encyclopedia of philosophy. http://plato.
stanford.edu/entries/logic-deontic/.
Moortgat, Michael. 1997. Categorial Type Logics. In Johan van Benthem &
Alice ter Meulen (eds.). Handbook of logic and language. Cambridge,
MA: MIT Press. 93–177.
Portner, Paul. 2009a. Modality. Oxford, UK: Oxford University Press.
Portner, Paul. 2009b. Permission and choice. Georgetown University:
Manuscript.
Restall, Greg. 2000. An introduction to substructural logics. London: Rout-
ledge.
van Rooij, Robert. 2008. Towards a uniform analysis of any. Natural
van Rooij, Robert. 2010. Conjunctive interpretation of disjunction. Semantics
and Pragmatics 3(11). doi:10.3765/sp.3.11.
Ross, Alf. 1941. Imperatives and logic. Theoria 7(1). 53–71.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice
permission. Synthese 147(2). 343–377. doi:10.1007/1-4020-4631-6_10.
Shan, Chung-chieh. 2004. Binding alongside Hamblin alternatives calls
for variable-free semantics. In Kazuha Watanabe & Robert B. Young
(eds.). Proceedings from Semantics and Linguistic Theory XIV. Cornell
University Press. 289–304.
Simons, Mandy. 2005. Dividing things up: the semantics of or and the
modal/or interaction. Natural Language Semantics 13(3). 271–316.
doi:10.1007/s11050-004-2900-7.
Wadler, Phil. 1993. A taste of Linear Logic. In Andrzej Borzyszkowski & Stefan
Sokolowski (eds.). Proceedings of the 18th international symposium on
mathematical foundations of computer science (Lecture Notes in Com-
puter Science Volume 711). Heidelberg: Springer. 185-210. doi:10.1007/3-
540-57182-5_12.
Zimmermann, Ede. 2000. Free choice disjunction and epistemic possibility.
Natural Language Semantics 8(4). 255-290. doi:10.1023/A:1011255819284.
Chris Barker
10 Washington Place
New York, NY 10003, USA
chris.barker@nyu.edu
http://homepages.nyu.edu/~cb125
doi: 10.3765/sp.3.6
The semantics and pragmatics of plurals
Donka F. Farkas
Department of Linguistics, University of California at Santa Cruz
Henriëtte E. de Swart
Department of Modern Languages, Utrecht University
Decision 2009-07-07 / Revised 2009-09-18 / Third Decision 2009-10-07 / Revised
2009-10-30 / Accepted 2009-11-21 / Final Version Received 2010-01-06 / Published
2010-03-30
Abstract This paper addresses the semantics and pragmatics of singular
and plural nominals in languages that manifest a binary morphological
number distinction within this category. We review the main challenges
such an account has to meet, and develop an analysis which treats the plural
morpheme as semantically relevant, and the singular form as not contributing
any number restriction on its own but acquiring one when in competition
with the plural form. The competition between singular and plural nominals
is grounded in bidirectional optimization over form-meaning pairs. The main
conceptual advantage our proposal has over recent alternative accounts
is that it respects Horn’s ‘division of pragmatic labor’, in that it treats
morphologically marked forms as semantically marked, and morphologically
unmarked forms as semantically unmarked. In our account, plural forms
are polysemous between an exclusive plural sense, which enforces sum
reference, and an inclusive sense, which allows both atoms and sums as
possible witnesses. The analysis predicts that a plural form is pragmatically
appropriate only in case sum values are among the intended referents.
To account for the choice between these two senses in context we invoke
the Strongest Meaning Hypothesis, an independently motivated pragmatic
principle. Finally, we show how the approach we develop explains some
puzzling contrasts in number marking between English three/more children
and Hungarian három/több gyerek (‘three/more child’), a problem that has
not been properly accounted for in the literature so far.
Keywords: singular, plural, morphology, markedness, optimality theory, strongest
meaning hypothesis, Hungarian
©2010 Farkas and de Swart

Farkas and de Swart
1 Atoms, sums and the inclusive/exclusive sum interpretation
1.1 Inclusive and exclusive interpretations of the plural
The question addressed in this paper is a simple one: What is the difference in
meaning between singular and plural nominals in languages such as English,
where this distinction is morphologically marked?1 The issue then is to
characterize the semantic difference between the pair in (1), as it pertains to
information conveyed by the contrast in number.
(1) a. Mary saw a horse.
b. Mary saw horses.
A disarmingly simple answer would be to say that singular nominals (such
as a horse above) refer to one entity while plural nominals (such as horses)
refer to more than one entity. Recast in more technical terms based on Link
(1983), this answer is formulated in (2):
(2) a. Singular nominals refer within the domain of atoms.
b. Plural nominals refer within the domain of sums.
In Link’s proposal, the domain of entities from which nominals take values
has the structure of a join-semilattice whose atoms are ordinary individuals
(in this case individual horses) and whose non-atomic elements are all the
possible sums of more than one atom (in this case groups of more than one
horse). Under the simple view, when a nominal is singular, the domain from
which its referent is chosen is the set of atoms in the semilattice denoted
by its head noun, while in case it is plural, its reference domain is the set of
sums in that semilattice.
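The structure just described can be made concrete with a small sketch. It is only an illustration under our own assumptions: individuals are modelled as frozensets of atom labels, the join operation as set union, and the names `semilattice`, `join` and `is_atom` are hypothetical helpers, not part of Link's formalization.

```python
from itertools import combinations


def semilattice(atoms):
    """Return every non-empty sum that can be built from the given atoms.

    Each element is a frozenset of atom labels: singletons play the role of
    atoms, larger sets play the role of sums (assumption of this sketch).
    """
    atoms = sorted(atoms)
    return {
        frozenset(combo)
        for size in range(1, len(atoms) + 1)
        for combo in combinations(atoms, size)
    }


def join(x, y):
    """Join of two elements of the semilattice, modelled as set union."""
    return x | y


def is_atom(x):
    """An element is atomic if it consists of exactly one individual."""
    return len(x) == 1


# Denotation of the head noun 'horse' in a model with three horses.
horses = semilattice({"h1", "h2", "h3"})

# Under the simple view in (2), singular nominals draw their referent from
# the atoms and plural nominals from the sums.
singular_domain = {x for x in horses if is_atom(x)}
plural_domain = {x for x in horses if not is_atom(x)}
```

With three atoms the semilattice has seven elements, three of them atoms and four of them sums; the two reference domains in (2) partition it.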
1 We use nominal here as a cover term for DPs and NPs. We limit the discussion to nominals
in regular argument position, and ignore special uses in predication, incorporation, etc.
(cf. de Swart & Zwarts 2009 and references therein for discussion of such constructions).
Among the languages that manifest a singular/plural morphological distinction are the
languages within the Germanic and Romance families as well as Finno-Ugric languages such
as Hungarian and Finnish. We will not deal here with languages that make more fine-grained
distinctions in number, involving duals or paucals (see Corbett 2000). Languages such
as Mandarin Chinese that lack morphological number distinctions are briefly taken into
consideration below but do not receive a full-fledged analysis in this paper; see Krifka
(1995) and Rullmann (2003) for relevant proposals. Nor do we go into issues concerning
non-morphological encoding of number information of the type discussed for Korean by
Kwon & Zribi-Hertz (2006) or for Papiamentu and Brazilian Portuguese by Kester & Schmitt
(2007).
Singulars and Plurals
The interpretation of the plural nominal in (1b) is labelled exclusive because
its reference is restricted to sums, excluding atoms: (1b) is interpreted
as claiming that Mary saw more than one horse. The classical challenge for
the naïve view of the semantics of the plural is the existence of so-called
inclusive plurals, exemplified in (3a-c). These are plural forms whose inter-
pretation appears to be indifferent to the atom/sum divide in that the plural
nominal is allowed to range over both atoms and sums.
(3) a. Have you ever seen horses in this meadow?
b. If you have ever seen horses in this meadow, you should call us.
c. Sam has never seen horses in this meadow.
Thus, a yes answer to (3a) normally commits the speaker to having seen one
or more horses; in (3b), the addressee is expected to call even if she has seen
a single horse in the meadow, and (3c) is judged false in case Sam has seen a
single horse in the meadow. The existence of inclusive readings comes as an
unpleasant surprise to the naïve view, which predicts that the plural forms
in (3a)-(3c) are interpreted exclusively, just like the plural in (1b).
Note next that even though plurals may receive an inclusive interpreta-
tion in questions and within the scope of negation, as shown in (3a-c), the
distinction between singulars and plurals is not fully obliterated in these
environments. This is illustrated by (4a) and (4b) taken from Farkas (2006)
and Spector (2007) respectively, who note that the plural is distinctly odd in
these examples because normally people have only one nose and only one
father.
(4) a. Does Sam have a Roman nose/#Roman noses?
b. Jack doesn’t have a father/#fathers.
The contrast between (3a-c) and (4a-b) shows that a plural form remains
sensitive to the atom/sum distinction, even in environments where it can
be interpreted inclusively. A plural is always odd when sum values are
pragmatically excluded from its domain of reference. Ideally, this property
should follow from the account of the semantics and pragmatics of number
interpretation without any specific stipulations.
So far then we have established that an account of number interpretation
has to explain why plural forms are susceptible to both exclusive and inclusive
readings, and furthermore, one has to understand why particular linguistic
environments favor one or the other shade of meaning, while at the same time
predicting the sensitivity of plural forms to sum reference in all contexts. In
the rest of this section we establish some further conceptual and empirical
challenges an adequate account of number must meet and discuss some of
the most influential previous ways of dealing with them.
In Section 2, which contains the core of our proposal, we give a semantics
for the singular/plural contrast. In keeping with facts about overt morphology
in the languages under consideration, we do not make use of a singular
morpheme and therefore do not assign singular forms any inherent ‘singular’
semantics. The plural morpheme on the other hand is treated as contributing
a polysemous meaning, with the inclusive and exclusive interpretations being
its two related senses. The atomic reference of the singular comes about in
our account as a result of the competition between singular and plural forms
in the spirit of previous analyses but starting from opposite assumptions.
This competition is modelled in bidirectional Optimality Theory.
In Section 3 we account for the inclusive/exclusive interplay exemplified
by the contrast between (1b) and (3a-c) by exploiting the Strongest Meaning
Hypothesis, an independently motivated pragmatic principle. We also show
that the analysis we propose predicts that a plural form always requires the
possibility of sum witnesses, thus explaining the contrast in (4a-b) without
any extra stipulation. Section 4 shows how the analysis of languages like
English extends to an apparently puzzling use of singular forms with sum
reference in Hungarian, while Section 5 sums up the results of the paper.
1.2 The strong singular/weak plural view
An immediate solution to the inclusive plural problem illustrated in (3a-c)
is sketched in Krifka 1989. Plural forms, he suggests, are semantically in-
different to the atom/sum distinction while singular forms involve number
semantics that imposes atomic reference. In this view, the plural is seman-
tically “weak” in that it makes no semantic contribution. The singular, on the
other hand, is semantically “strong” in that it imposes an atomic reference
requirement. The ‘exclusive’ interpretation of the plural in sentences like (1b)
is due, in Krifka’s view, to a pragmatic blocking effect. The existence of the
semantically strong singular form blocks the use of the semantically weak
plural when atomic reference is intended because of a pragmatic rule that
forces the choice of a more specific form over a less specific one when the
two are equally complex. Since a singular nominal is more specific than its
plural counterpart, the singular has to be chosen whenever atomic reference
is meant, thus excluding an atomic interpretation for plural forms.
This idea is worked out in detail in Sauerland 2003 and Sauerland, An-
derssen & Yatsushiro 2005. In Sauerland et al. 2005, there are two number
features, SG and PL, located syntactically in the head of a φP node, as in figure
1, where *boy is a number-neutral predicate, insensitive to the atom/sum
distinction:
[φP [φ SG/PL] [DP [D the] [NP *boy]]]
Figure 1 Number features in Sauerland et al. 2005
The proposed semantic contribution of the two number features is given
in (5):
(5) Semantics of the singular/plural in Sauerland et al. 2005:
a. SING(x) is defined only if #x = 1
SING(x) = x wherever it is defined
b. PLUR(x) is always defined
PLUR(x) = x wherever it is defined
In this approach, the plural feature is semantically weak because it contributes
nothing to the interpretation of the phrase it occurs in. The singular
feature, on the other hand, is semantically strong because it contributes a
presupposition of singleton (or atomic) reference.
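The definitions in (5) amount to two identity functions that differ only in their domain. A minimal sketch follows, assuming that individuals are modelled as frozensets of atoms, that #x is the size of that set, and that undefinedness is rendered as a raised exception; the exception class `PresuppositionFailure` is our own illustrative name.

```python
class PresuppositionFailure(Exception):
    """Raised when a feature is applied outside its domain of definition."""


def SING(x):
    """(5a): defined only if #x = 1; returns x wherever it is defined."""
    if len(x) != 1:
        raise PresuppositionFailure("SING requires atomic reference")
    return x


def PLUR(x):
    """(5b): always defined; returns x unchanged."""
    return x


one_boy = frozenset({"b1"})
two_boys = frozenset({"b1", "b2"})

# PLUR accepts atoms and sums alike, which is what makes it 'weak'.
atom_through_plur = PLUR(one_boy)
sum_through_plur = PLUR(two_boys)

# SING is 'strong': it passes atoms through and rejects sums.
atom_through_sing = SING(one_boy)
try:
    SING(two_boys)
    sing_rejects_sum = False
except PresuppositionFailure:
    sing_rejects_sum = True
```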
The exclusive reading of plurals exemplified in (1b) is derived with the
help of the principle of Maximize Presupposition originally proposed in Heim
1991 to account for the non-uniqueness inference of indefinite DPs. If there
is a choice between two alternative morphemes that differ only in that one
has more presuppositions than the other, this principle requires speakers
to choose the morpheme that has the most presuppositions satisfied in
the context. Given the semantics in (5), Maximize Presupposition predicts
that the plural form in (1b) is interpreted as exclusive since if the atomic
presupposition of the singular had been met, Maximize Presupposition would
have mandated the use of the singular form.
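The reasoning behind this prediction can be sketched as follows. The sketch assumes the semantics in (5), models individuals as frozensets, and reduces Maximize Presupposition to picking, among the features whose presuppositions hold of the intended referent, the one with the most presuppositions. The names `FEATURES`, `maximize_presupposition` and `compatible_cardinalities` are hypothetical.

```python
# Each feature is paired with the list of presuppositions it carries.
FEATURES = {
    "SG": [lambda x: len(x) == 1],  # presupposes atomic reference, cf. (5a)
    "PL": [],                       # carries no presupposition, cf. (5b)
}


def maximize_presupposition(referent):
    """Pick the usable feature that has the most presuppositions."""
    usable = {
        name: presups
        for name, presups in FEATURES.items()
        if all(p(referent) for p in presups)
    }
    return max(usable, key=lambda name: len(usable[name]))


def compatible_cardinalities(feature, max_n=3):
    """Hearer's side: sizes of referent for which the speaker would have
    chosen this feature."""
    return [
        n
        for n in range(1, max_n + 1)
        if maximize_presupposition(frozenset(range(n))) == feature
    ]


speaker_choice_for_atom = maximize_presupposition(frozenset({"h1"}))
speaker_choice_for_sum = maximize_presupposition(frozenset({"h1", "h2"}))
hearer_inference_from_plural = compatible_cardinalities("PL")
```

Because the speaker is forced to use SG whenever the referent is atomic, a hearer who encounters PL can exclude cardinality one, which is the exclusive reading of (1b).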
In order to account for the inclusive interpretation of plural forms in
sentences such as (3a-c), Sauerland et al. weaken Maximize Presupposition as
in (6):
(6) Maximize Presupposition applies to the scope of an existential if this
strengthens the entire sentence.
Sauerland et al. treat indefinites as generalized quantifiers with existential
force, and decompose no syntactically into an indefinite and negation.
Presupposition maximization applied to the scope of the existential adds a
condition that would make the entire utterance logically weaker when the
existential occurs in a downward entailing environment. The generalization
in (6) blocks this process, and thus predicts an inclusive reading for plural
indefinites within the scope of negation and more generally, in downward
entailing environments. There are, however, problems concerning the pre-
cise details of when and how Maximize Presupposition is suspended, for a
discussion of which we refer the reader to Spector (2007: 267–271).
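The effect of (6) can be checked on a toy model. The sketch below makes several simplifying assumptions of our own: a world is identified with the number of horses seen, the unenriched scope of the existential is 'at least one', the enriched scope is 'at least two', and one sentence is stronger than another if its set of verifying worlds is a proper subset of the other's. The function names are hypothetical.

```python
WORLDS = range(0, 4)  # each world = number of horses seen (0 to 3)


def inclusive(n):
    """Scope of the existential without the added condition."""
    return n >= 1


def enriched(n):
    """Scope of the existential with the non-atomicity condition added."""
    return n >= 2


def proposition(pred):
    """Set of worlds in which a sentence is true."""
    return {w for w in WORLDS if pred(w)}


def local_mp_applies(embed):
    """(6): enrich the scope only if the whole sentence becomes stronger."""
    plain = proposition(lambda n: embed(inclusive(n)))
    strengthened = proposition(lambda n: embed(enriched(n)))
    return strengthened < plain  # proper subset = strictly stronger


def positive(value):
    """Upward entailing environment, as in (1b)."""
    return value


def negated(value):
    """Downward entailing environment, as in (3c)."""
    return not value


applies_in_positive = local_mp_applies(positive)
applies_under_negation = local_mp_applies(negated)
```

In the positive environment the enriched sentence is true in fewer worlds, so enrichment applies and the plural is read exclusively. Under negation the enriched sentence would be true in more worlds, so enrichment is blocked and the plural keeps its inclusive reading.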
A different account is argued for in Spector 2007. Spector also posits two
number features, a singular and a plural, but in this approach each has its
own semantic contribution. The semantics of the singular feature imposes
atomic reference while the semantics of the plural is inclusive. The exclusive
plural interpretation of (1b) comes about as the result of a second-order
scalar implicature denying the ‘exactly one’ reading of (1b).2 For details and
contrasts in predictions between Sauerland’s position and Spector’s with
respect to bare plurals in non-monotonic and universally quantified contexts,
see Spector 2007.
The analysis we develop here shares with these earlier approaches the
insight that the competition between singular and plural forms drives their
interpretation in a process that intertwines semantics and pragmatics. The
crucial difference between these earlier approaches and ours is that we treat
the singular as semantically weak and the plural as semantically strong.
Krifka (1989) and Sauerland et al. (2005) use blocking to derive the interpre-
tation of the semantically weak plural given the existence of a semantically
strong singular, whereas we posit a semantically strong plural, and use
blocking to derive the interpretation of the semantically weak singular. This
reversal is worth striving for because it reconciles the semantics of number
with Horn’s division of pragmatic labor, an issue we turn to next.
2 Note that this implicature is independent of the suitability implicature that Cohen (2005)
proposes to distinguish bare plurals from plurals with an overt indefinite determiner.
1.3 Reconciling number semantics with the Horn pattern
Any account in which the singular feature makes a semantic contribution
while the plural does not forces one to distinguish between semantic and
morphological markedness. It has long been known that there is a strong
tendency for languages that have a singular/plural contrast in nominals
to morphologically mark plural forms and leave singular forms morpho-
logically unmarked (Greenberg 1966; Corbett 2000).3 Such languages have
a morpheme used in plural nominals but no special singular morpheme,
and therefore plural forms are morphologically marked while singulars are
not. But under a strong singular/weak plural view, it is the singular that
makes a semantic contribution while the plural is semantically vacuous.
Thus, in Sauerland et al. 2005 the singular is morphologically unmarked,
but semantically marked (cf. 5a), while the plural is morphologically marked
and semantically unmarked (cf. 5b).4 It is, in fact, this very tension be-
tween semantic and morphological markedness that made the existence of
inclusive plurals interesting in the first place. McCawley (1981) raises the
question of how to reconcile the morphology and the semantics of number
given the general tendency of language to pair morphologically unmarked
forms with semantically unmarked meanings. The morphological asymmetry
between singular and plural forms is also unexplained under the analysis in
Spector 2007 because in that account both singular and plural features are
semantically potent.
Following van Rooij (2004) and others, we call the fundamental connec-
tion between semantic and morphological markedness Horn’s division of
pragmatic labor or the Horn pattern, but note that this generalization has a
long history that reaches back long before Horn’s work. Citing structuralist
and Prague school views on markedness, Horn (2001: 155) describes it as
follows: “. . . one member of an opposed pair is literally marked (overtly
signaled) while the other is unmarked (signaled via the absence of an overt
signal). Semantically, the marked category is characterized by the presence
of some property P, while the corresponding unmarked category entails
nothing about the presence or absence of P but is used chiefly (although not
exclusively) to indicate the absence of P (Jakobson 1939)”.
3 Exceptions to this generalization exist, and are discussed in the literature. For instance, in
some ‘singulative’ languages such as Welsh, a singular morpheme is used in instances where
unmarked reference is to groups, or to unindividuated mass. Some instances of reversed
markedness are addressed in de Swart & Zwarts (2010). We will be concerned here with the
typologically most frequent pattern, where the plural is morphologically marked, but the
singular is not.
4 For further discussion of the special status of number within nominal φ features with
respect to semantic and morphological markedness within the assumptions of Sauerland
(2003) and Sauerland et al. (2005), see Sauerland (2008).
Any strong singular/weak plural analysis involves an anti-Horn pattern
because in such an approach the singular forms are assigned a strong seman-
tics (requiring atomic reference, which plays the role of P above), while plural
forms are given a weak interpretation, neutral with respect to whether values
are chosen among atoms or sums. Recently, Bale, Gagnon & Khanjian (in
press) have explicitly defended the anti-Horn pattern for number, claiming
that the empirical data are only reconcilable with a negative correlation be-
tween morphological and semantic markedness. A central goal of the present
paper is to challenge the anti-Horn view, and achieve a reconciliation of the
semantics and the morphology of number, formulated in A:
A. Plural forms should be semantically marked relative to singular forms
so as to preserve the correspondence between morphological and
semantic markedness seen elsewhere in language.
Analyses that are in line with the Horn pattern in the sense that they treat the
plural feature as making a semantic contribution while treating the singular as
semantically vacuous are called here weak singular/strong plural approaches.
They are preferable on theoretical grounds to their competitors because they
explain the asymmetry in number morphology in languages that have a plural
but no singular morpheme and thus reconcile morphological and semantic
markedness. Endowing the plural morpheme with a semantic contribution
and deriving the interpretation of singular forms from the absence of the
plural morpheme makes sense of the systematic morphological asymmetry
between singular and plural forms. The existence of inclusive plural readings
constitutes the main empirical challenge for the weak singular/strong plural
approach. Before we address this problem and offer a solution, we present
data from Hungarian that appear puzzling for a strong singular/weak plural
account but not for a weak singular/strong plural view.
1.4 Cross-linguistic challenges
In this subsection we review two sets of facts that add further challenges to
any account of number interpretation. The first comes from Hungarian, a
language that displays a pattern of number marking that raises an empirical
challenge to approaches that treat singular forms as requiring atomic refer-
ence. Just like English and other Indo-European languages, Hungarian has a
singular/plural distinction:
(7) a. Mari látott egy lovat. [Hungarian]
Mari saw a horse.acc
‘Mari saw a horse.’
b. Mari látott lovakat.
Mari saw horse.pl.acc
‘Mari saw horses.’
There is no special morphology marking singular forms, while the plural
feature is realized by the morpheme -(a)k.5 (8b) shows that in Hungarian, just
like in English, verbs must agree with their subjects in number, and that -(a)k
realizes the plural feature on verbs as well:
(8) a. A gyerek elment.
the child leave.past
‘The child left.’
b. A gyerekek elmentek / *elment.
the child.pl leave.past.pl / leave.past
‘The children left.’
Hungarian is like English also in that plural nominals may have inclusive
uses. In (9a) for instance, the addressee is expected to give a positive answer
even if she saw a single horse, (9b) claims that Anna has not seen one or
more horses, and (9c) asks the addressee to say something if she saw one or
more horses.6
(9) a. Láttál valaha lovakat?
see.past.II ever horse.pl.acc
‘Have you ever seen horses?’
b. Anna nem látott lovakat.
Anna not see.past horse.pl.acc
‘Anna hasn’t seen horses.’
c. Ha láttál valaha lovakat, szólj.
if see.past.II ever horse.pl.acc say.imp
‘If you have ever seen horses, say so.’
5 The vowel a is in parentheses here because in many phonological analyses it is treated
as epenthetic. The quality of this vowel is determined by vowel harmony as well as by
morphological considerations that are irrelevant for our purposes.
6 The contrast between inclusive and exclusive plurals in Hungarian is complicated by the fact
that in this language bare nominals (whether singular or plural) can incorporate, an issue
discussed at length in Farkas & de Swart 2003. Incorporated singulars are number neutral,
while incorporated plurals have sum reference. We will not be concerned with incorporated
nominals in this paper, but only note that our proposals are compatible with the analysis of
incorporation proposed by Farkas & de Swart (2003).
The problematic data concern DPs whose determiner entails reference to
sums, including but not limited to cardinals bigger than one, exemplified in
(10):7
(10) a. három gyerek / *három gyerekek
three child / three child.pl
‘three children’
b. sok gyerek / *sok gyerekek
many child / many child.pl
‘many children’
c. mindenféle gyerek / *mindenféle gyerekek
all.kind child / all.kind child.pl
‘all kinds of children’
d. több gyerek / *több gyerekek
more child / more child.pl
‘more children’
e. egy pár gyerek / *egy pár gyerekek
a couple child / a couple child.pl
‘a couple of/some children’
As one can see from these examples, such DPs must be morphologically
singular. Note that these cases involve not only cardinal numerals but other
types of Ds as well. Therefore, no analysis specific to cardinals, such as
the one proposed in Ionin & Matushansky 2006, can cover all the relevant
examples. That these DPs are semantically plural can be seen from the fact
that they may occur as subjects of verbs like összegyülni ‘to gather’, as seen
in (11a). The fact that they are not necessarily distributive, and therefore that
they are, or at least can be, referential is shown in (11b-d).8 The data are the
same for all the D types exemplified in (10).
(11) a. Sok gyerek gyűlt össze a téren.
many child gather.past part the square.on
‘Many children gathered in the square.’
b. Három gyerek felemelt egy zongorát.
three child lift.past a piano.acc
‘Three children lifted a piano.’
c. A három gyerek elérte a plafont.
the three child reach.past the ceiling.acc
‘The three children reached the ceiling.’
d. *Három gyerek_i azt hiszi, hogy ő_i a legjobb.
three child it.acc believe that III the best
‘Three children_i think that he_i is the best’
7 Hungarian is not the only language that displays this pattern of number marking, but it is
sufficient to work out the data for one particular language to make the relevant theoretical
point.
8 We are grateful to an anonymous reviewer for drawing our attention to the data in (11b-d).
Example (11b) is most naturally interpreted as involving a single lifting of
a single piano in which three children participated together. (11c) can be
interpreted as involving the reaching of the ceiling by one child helped along
by her two teammates. In (11d) we see that just like in the English equivalent,
*Three children_i think that he_i is the best, these DPs cannot bind singular
pronouns. Note that these properties distinguish the DPs in (10) and (11)
from necessarily quantificational, non-referential DPs such as those headed
by each/mindegyik in English and Hungarian respectively.
The data in (10) are compatible with Rullmann & You’s (2003) semantics
of number, but their analysis of the Hungarian singular as number neutral
associates exclusive sum reference with the plural morpheme, and therefore
does not account for the inclusive interpretation of the plural in (9). The
analyses in Sauerland et al. 2005 and Spector 2007 have no problem with the
inclusive plural interpretation in (9) but the singular form of the DPs in (10)
is problematic, given that the atomic reference semantics that the singular is
crucially supposed to have is violated here.
Next, note that we cannot simply treat these forms as involving the
presence of a [Pl] feature within the nominal that happens not to be realized
on the head N. Such an analysis would predict that these DPs trigger plural
verb agreement when in subject position, a prediction that is not borne out,
as shown in (12):
(12) a. Három gyerek elment / *elmentek
three child leave.past / leave.past.pl
‘Three children left.’
b. Mindenféle gyerek jelentkezett / *jelentkeztek
all.kind child apply.past / apply.past.pl
‘All kinds of children applied.’
Finally, note that the DPs in (10) are similar to plural DPs in that discourse
pronouns referring back to them must be plural. If (13) is the continuation of
(12a) and if the three children are to be the antecedent of the direct object
pronoun, that pronoun must be plural.
(13) Mari nem látta őket / *őt.
Mari not see.past III.pl.acc / III.acc
‘Mari didn’t see them.’
These observations show that the DPs in (10) are semantically plural in
that they refer to sums but that morphologically, they are singular. This
characterization accounts for the data under the assumption that Subject-
Verb agreement is sensitive to the morphological feature of the DP, while the
form of a discourse pronoun is sensitive to the semantics of its antecedent
(see Farkas & Zec 1995 for discussion). The morphology explains the intra-
sentential agreement pattern these DPs trigger while their semantics explains
the form of the discourse pronouns for which they serve as antecedents.
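The division of labor between morphology and semantics can be pictured with a small data-structure sketch. It is purely illustrative: the class `DP` and the functions `verb_form` and `object_pronoun` are our own hypothetical names, and only the Hungarian forms elment/elmentek and őt/őket are taken from examples (8), (12) and (13).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DP:
    form: str
    morph_plural: bool   # does the nominal carry the plural morpheme -(a)k?
    sum_reference: bool  # does the DP refer to a sum?


def verb_form(subject):
    """Subject-verb agreement looks only at the morphological feature."""
    return "elmentek" if subject.morph_plural else "elment"


def object_pronoun(antecedent):
    """A discourse pronoun looks only at the semantics of its antecedent."""
    return "őket" if antecedent.sum_reference else "őt"


a_gyerek = DP("a gyerek", morph_plural=False, sum_reference=False)
a_gyerekek = DP("a gyerekek", morph_plural=True, sum_reference=True)
harom_gyerek = DP("három gyerek", morph_plural=False, sum_reference=True)

# The DP type in (10) patterns with singulars for agreement, as in (12a),
# and with plurals for discourse anaphora, as in (13).
agreement_with_harom_gyerek = verb_form(harom_gyerek)
pronoun_for_harom_gyerek = object_pronoun(harom_gyerek)
```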
We see then that in Hungarian, singular forms must be used in certain
cases of sum reference, a situation that is problematic for any strong singular
view. The challenge raised by the Hungarian data reviewed here is formulated
in B.
B. There are languages with a morphological singular/plural distinction
in nominals, where singular forms may have sum reference in case
sum reference is entailed by the determiner.
The account of the contrast between English and Hungarian we offer in
Section 4 below differs in empirical coverage from that found in Sauerland
et al. 2005 and Rullmann & You 2003 in that we capture the similarities
between the two languages when it comes to the interpretation of ordinary
plural forms in (7)–(9) as well as the differences between them when it
comes to the DP types exemplified in (10).
The second cross-linguistic empirical problem we consider is raised by
languages such as Mandarin Chinese that do not have a morphological con-
trast between singular and plural forms. Nominals unmarked for number in
such languages get a number neutral interpretation, as emphasised by Krifka
(1995), on the basis of examples such as (14):
(14) Wò kànjiàn xióng le. [Mandarin Chinese]
I see bear asp
‘I saw a bear/some bears.’
The empirical generalization we draw from contrasting the interpretation of
non-plural forms in English-type languages and Chinese-type languages is
formulated in C.
C. In languages which lack morphological number marking on nominals,
unmarked forms are number neutral.
We capture this generalization below but will not work out the semantics of
Chinese nominals since we focus here on languages that have a morphological
number distinction. Our approach to these languages is compatible with
Rullmann & You’s semantics of Mandarin.
We have seen in this section that the naïve (and attractive) view of number
interpretation we started with, according to which singular nominals refer
to atoms and plural ones refer to sums, faces two stumbling blocks: (i) the
existence of cases in which plural nominals are interpreted inclusively; (ii) the
number marking system in languages like Hungarian, where certain singular
nominals must receive a non-atomic interpretation. Retreating to a view
according to which the singular is semantically potent while the plural is
semantically empty runs against the Horn pattern of markedness, and has
difficulty with the Hungarian data as well. In the remainder of this paper, we
work out an account of number interpretation which:
i. accounts for the existence of the inclusive as well as the exclusive
interpretation of plurals;
ii. respects the Horn pattern and is in line with the morphological
markedness facts (generalization A);
iii. predicts when inclusive interpretations are possible (1b vs. 3 and 7b
vs. 9);
iv. predicts the possibility of certain singular forms referring to sums in
languages like Hungarian (generalization B, examples 10–13);
v. is compatible with the number neutral interpretation of nominals in
languages that do not have a morphological mark for plurals, such as
Mandarin Chinese (generalization C, example 14).
We focus here on the semantics of number interpretation on nominals in
regular argument position (cf. footnote 1), and do not discuss issues con-
cerning the feature [Pl] when it occurs as an agreement feature on verbs and
VPs for instance. We concentrate on N-headed nominals, and leave detailed
discussion of pronouns and coordinate DPs for future work.
2 The semantics of singular and plural nominals
In this section we give our account of the interpretation of the plural feature
and its associated morpheme in the languages under consideration and
derive the interpretation of singular forms based on it. We start from what
we consider the null hypothesis, according to which, in languages with a
binary number distinction, there is a single, privative morphological feature
[Pl] in nominals and no singular feature. We assume that this feature is
generated in NumP, a node that is dominated by DP. We give the feature [Pl]
a polysemous semantics and derive the restriction of singular nominals (i.e.,
nominals that lack the feature [Pl]) to atomic reference under bidirectional
optimization. The bidirectional OT model we use is based on Mattausch
2005, 2007, a set-up that captures the harmonization of unmarked forms
with unmarked meanings, and of marked forms with marked meanings.9
2.1 Bi-directional optimization over form-meaning pairs
Our analysis is cast in the framework of Optimality Theory (OT), a theory that
defines well-formedness in terms of optimization over a set of output candi-
dates for a particular input. OT syntax, for instance, defines grammaticality
as the optimal form that conveys a particular meaning, and thus represents
the speaker orientation (production). OT semantics picks the optimal inter-
pretation of a given form as the meaning construed by the hearer for that
form (comprehension). Bidirectional OT deals with the syntax-semantics
interface by combining the two directions in an optimization process over
form-meaning pairs (Hendriks, de Hoop, de Swart & Zwarts 2010). This frame-
work is appropriate for the problem at hand because it allows us to treat
the interpretation of singular and plural nominals in tandem, as a matter of
competition between the two forms.
As was made clear in the previous section, we treat as fundamental
to the enterprise the fact that plural forms are morphologically marked
and singular forms are not. Bidirectional OT is particularly useful to us
because Mattausch (2005, 2007) has already worked out in this framework
an abstract way of modeling the association of forms and meanings as an
optimal communication strategy that captures the Horn pattern. His proposal
can be applied to our problem in a straightforward way.
9 Mattausch’s work goes back to ideas developed by Jäger (2003) and Blutner (1998, 2000,
2004). For a slightly different bidirectional OT set-up, see Beaver 2002. For a comparison of
different bidirectional OT models, see Beaver & Lee 2004.
The gist of Mattausch’s system is the following. Suppose there are two
forms, one overtly marked (m), the other unmarked (u) and suppose there
are two meanings, an unmarked (more frequent, or simpler) meaning α and a
marked (less frequent, or more complex) meaning β. Their combination leads
to four possible form-meaning pairs: ⟨u, α⟩, ⟨m, α⟩, ⟨u, β⟩, ⟨m, β⟩. How do
we determine which pairs are the optimal, most harmonic ones?
As a starting point, Mattausch posits four bias constraints, one for each
of the possible form-meaning pairings:
(15) Bias constraints
*m, α: the (marked) form m is not related to the (unmarked) meaning α.
*m, β: the (marked) form m is not related to the (marked) meaning β.
*u, α: the (unmarked) form u is not related to the (unmarked) meaning α.
*u, β: the (unmarked) form u is not related to the (marked) meaning β.
These constraints penalize all possible form-meaning combinations. They
become operative only because they are differentially ranked relative to a
general markedness constraint, *Mark, a constraint that penalizes the use of
the marked form. This constraint models a notion of economy that prefers
simpler forms over more complex ones.
All constraints are soft, but the ease with which they can be violated
depends on their relative strength. The ranking of *Mark with respect to
the bias constraints reflects the balance of economy considerations relative
to faithful correspondence relations between forms and meanings in the
process of optimal communication. Mattausch derives the ranking of the bias
constraints relative to the markedness constraint from iterated learning over
several generations in a computational learning model based on frequency
distributions (cf. Kirby & Hurford 1997). Comparisons of forms and meanings
trigger the promotion or demotion of constraints. If the marked meaning is
less frequent, iterated learning over several generations with the four bias
constraints leads to a stochastic OT grammar in which the ranking of the
bias constraints mirrors the frequency distribution of the meanings α and β.
The central idea we take over from Mattausch’s work is that the relative
ordering of the bias constraints and the markedness constraint is such as to
result in an absolute preference for the association of the unmarked meaning
with the unmarked form and the association of marked meaning with the
marked form, thereby capturing Horn’s division of pragmatic labor. The
universal constraint ranking Mattausch derives is given in (16):
(16) {*u, β; *m, α} ≫ *Mark ≫ {*u, α; *m, β}.
Marked forms always violate *Mark, so under the ranking in (16), they only
appear with the marked meaning β. Mattausch (2005, 2007) derives the emer-
gence of Horn’s division of pragmatic labor as the optimal communication
strategy that arises under evolutionary pressure.
2.2 Morphological and semantic markedness in the domain of number
Before we can apply Mattausch’s abstract model to the interpretation of
singular and plural nominals, we have to establish which forms and which
meanings correspond to u, m, α and β in the domain of number, and we
have to establish what the relevant markedness constraint is.
Concerning formal markedness, recall that we are concerned with the
typologically frequent pattern in which the plural is morphologically marked,
and the singular remains unmarked. Because of this asymmetry, the singular
is the unmarked form u, and the plural is the marked form m. On this point
the literature is in agreement: see Sauerland 2008 and Bale et al. (in press) for
recent discussion. We differ from previous approaches, however, in adopting
the null hypothesis and taking plural morphology to mark the presence of
the privative feature [Pl], and not positing a singular feature or a null singular
morpheme.
Establishing semantic markedness is a more delicate matter because there
are several distinct parameters along which it can be defined, besides fre-
quency. We mention here some major contenders. Denotational markedness
involves the subset relation between the denotation of an item i and that
of an item i’. For instance, the lexical item dog is denotationally unmarked
relative to the lexical item bitch. Conceptual markedness concerns the nature
of the denotation of two items: the denotation of the unmarked item i is
conceptually simpler than the denotation of the marked item i’. In temporal
semantics, for instance, the present is perceived as conceptually less marked
than the past and the future since the latter two are defined in terms of the
former (Lakoff 2000: 44). Finally, we distinguish a third type of semantic
markedness, semantic complexity, according to which an item i is less marked
than an item i’ iff i’ is associated with a semantic requirement that is lacking
in i. For example, the definite article is semantically more complex than the
indefinite one under analyses where the definite article is associated with a
uniqueness requirement while the indefinite article is neutral in this respect
(see Heim 1991 and Farkas 2006).
In discussions of semantic markedness in the domain of number, the
notion of denotational markedness has dominated. As we have seen already,
Sauerland et al. (2005) take the plural to denote within the entire domain of
the nominal (= *N, cf. Figure 1 above) while Bale et al. (in press) assign the
plural feature an augmentative semantics, which takes the join of all atoms
and sums in the semi-lattice of N. As a result, the denotation of the singular
(which has to have atomic reference) is a strict subset of the denotation of
the plural in both proposals. The same is true for the account in Spector
2007. Approaches which rely on denotational markedness alone then lead
to an anti-Horn analysis. However, this is not the only way one can go in
relating number interpretation to the Horn pattern.
Our analysis of number is grounded in a notion of markedness in terms
of semantic complexity. No singular feature is posited for a singular nominal,
and no inherent number semantics is assigned to this form. Plural nominals
are assumed to involve an overt plural feature [Pl] realized by a plural mor-
pheme whose semantic contribution concerns the atom/sum distinction. If
a singular nominal is not inherently associated with any number semantics
while a plural nominal comes with such a constraint, the singular form quali-
fies as semantically unmarked relative to the plural with respect to semantic
complexity.
In addition, in terms of conceptual complexity, we take atomic reference
to be less marked than sum reference. We follow Link (1983) in taking the
domain of interpretation from which variables are assigned values to consist
of atoms and sums, where the latter are built from the former by means of
the join operation ⊕. Given that atoms may exist independently of sums, but
not the other way around, a nominal that denotes within the domain of sums
is conceptually more marked than one that denotes within the domain of
atoms only.
Support for this conceptual markedness view is found in psychological
research that points to the special nature of sum reference. Recent studies
suggest that non-human primates and children under
two represent small sets of objects as object-files, and do not establish a
singular-plural distinction based on atoms vs. sums (Hauser, Carey & Hauser
2000, Feigenson, Carey & Hauser 2002, Feigenson & Carey 2003, 2005). The
evidence comes from a variety of non-linguistic tasks.10 Wood, Kouider &
Carey (2004) and Kouider, Halberda, Wood & Carey (2006) assume that by the
time children learn the meanings of linguistic markers for the singular-plural
distinction, they must have distinguished between singletons and sets (or
sums, in our terms). They do indeed find that children over two who have
started to produce the plural marker understand that it signals reference to
multiple objects.11 Whereas representations of individual objects and object
arrays are available from ten months onward, the representation of multiple
objects as sums is paired up with the acquisition of linguistic markers of
plurality (around 24 months). The psycholinguistic evidence supports the
view that sum reference is conceptually marked as opposed to reference to
atoms.
In our view, the interpretation of nominal number concerns restricting
the domain from which witnesses of a nominal can be chosen in terms of
the atom/sum distinction. The conceptually marked reference is one that
includes sums, i.e., one that allows possible sum witnesses. Allowing sums
within the domain of reference is then the crucial markedness parameter
when it comes to number interpretation in the languages under considera-
tion. Consequently, the denotational space of nominals is divided into two
subdomains, one that includes sums and one that excludes them. Nominals
that refer in the latter domain have exclusive atom reference, while nominals
that refer within the former have sum reference. Next, note that there are
two ways in which a nominal can have sum reference: (i) its reference may be
restricted to sums (excluding atoms), a case we call exclusive sum reference,
and (ii) its reference may include sums but not exclude atoms, a case we call
inclusive sum reference. Thus, the formally marked plural form may fulfil
the requirement of being associated with the conceptually marked number
interpretation by being associated with either inclusive or exclusive sum
reference. In the latter case all witnesses must be sums while in the former
case some witnesses must be sums. The formally unmarked singular form
is conceptually unmarked when its denotation excludes sum reference, i.e.,
when it denotes exclusively in the realm of atoms.
10 For example, infants watch while sets of crackers are placed into two different buckets.
When encouraged to crawl to one of the buckets, infants reliably choose the bucket with
more crackers with numbers up to three. When one set of crackers exceeds three (in four
vs. two, six vs. three or even four vs. one comparisons), infants up to 20 months perform at
chance. All they would have to do to succeed on a one vs. four comparison is to represent
one as a singular individual and four as a plurality, but they fail to do so, and do not show a
preference for the bucket containing more crackers. The three-item limit is expected when
infants’ representation is object-based, as the object-file system is assumed to be subject to
the working memory limit of three to four items.
11 The experiments use a preferential looking paradigm, and test sensitivity to number
expressed on the verb (is/are), on the noun (using nonsense words, e.g. the blicket/the
blickets) and with quantifiers (a blicket/some blickets). The results suggest that learning the
force of number marking on linguistic expressions strongly correlates with the conceptual
distinction between sets (sums) and individuals (atoms).
When it comes to denotational markedness, matters are complicated
because the domain of atoms is, of course, a subset of the domain of atoms
and sums. Note, however, that in our account, singular nominals do not
have a number feature of their own and thus do not involve inherent number
semantics. Plural forms on the other hand are treated as having a number
feature whose semantics forces them to include sums within their denotation
and thus a plural form cannot denote exclusively in the realm of atoms. We
suggest that because of the competition with plural forms, the denotation of
singular nominals ends up being restricted to the complement of the denota-
tion of plurals, i.e., singular nominals end up being interpreted as denoting in
the exclusive atom realm. More specifically, the account we propose involves
the formally and conceptually marked plural form blocking the formally
and conceptually unmarked singular form from being interpreted as having
sums within its domain. When the competition between the singular and the
plural is inoperative, however, singular forms are number neutral, and can,
in principle, denote in any subdomain of the lattice. Hungarian sum denoting
singulars are a case in point. What is impossible, according to our account,
is for a plural form to have exclusive atomic reference, a situation that is
indeed unattested as far as we know.
In the next subsection we make these proposals concrete by implementing
Mattausch’s abstract system of the pairing of form and meaning in the
domain of number. The core operative concept for number interpretation in
our system is conceptual markedness according to which atomic reference
is the unmarked meaning α, whereas sum reference (whether inclusive or
exclusive) is the marked meaning β.
2.3 Distribution of atomic/sum reference over singular/plural nominals
The forms we are dealing with in the bidirectional optimization process are
morphologically singular and plural nominals, which we denote by sg and pl.
Sg here is short for a DP that has no number feature in its NumP, while pl
is short for a nominal that has the feature [Pl] in NumP. The interpretations
associated with these forms are atomic reference and inclusive or exclusive
sum reference respectively, which we denote by at and i/e sum. The bias
constraints for number are given in (17):
(17) Bias constraints for number:
*pl, at: a plural nominal does not have atomic reference.
*pl, i/e sum: a plural nominal does not have inclusive/exclusive sum reference.
*sg, at: a singular nominal does not have atomic reference.
*sg, i/e sum: a singular form does not have inclusive/exclusive sum reference.
The markedness constraint on forms is *functN, the constraint proposed by
de Swart & Zwarts (2008, 2009, 2010) as the central economy constraint in
the nominal domain:
(18) *functN: avoid functional structure in the nominal domain
*functN prefers ‘bare’ nominals without articles, number morphology,
classifiers, etc. over nominals involving the functional features or projections
that host these expressions. The elaborate structures we find in the nominal
domain support the view that *functN is a soft, violable constraint that
can be overruled by faithfulness constraints driving the expression of sum
reference, discourse referentiality, definiteness, etc. (cf. de Swart & Zwarts
2008, Hendriks et al. 2010: chapter 7). However, its influence is pervasive,
even in languages in which such faithfulness constraints are ranked high, as
argued by de Swart & Zwarts (2009) in relation to a range of bare nominal
constructions in Germanic and Romance languages.
The crucial ordering that emerges under the assumption that i/e sum
reference is semantically more marked than atomic reference is in (19):
(19) The Horn pattern for number
{*sg, i/e sum; *pl, at} ≫ *functN ≫ {*pl, i/e sum; *sg, at}
The ranking in (19) captures the insight that nominals marked with the
feature [Pl] must include sums within their domain of reference, and that the
interpretation of a singular form (when in competition with a plural) is atomic
reference. In line with Horn’s division of pragmatic labor, the ranking in (19)
pairs up marked plural forms with marked sum reference meanings, and
unmarked singular forms with unmarked atomic meanings. The optimization
over form-meaning pairs under this ranking is spelled out in the bidirectional
Tableau 1, where singular and plural forms are paired up with their respective
domain of interpretation in the lattice.

                   | *sg, i/e sum : *pl, at | *functN | *pl, i/e sum : *sg, at
⟨sg, at⟩ ☺         |              :         |         |              : ∗
⟨sg, at ∪ sum⟩     | ∗            :         |         |              :
⟨sg, sum⟩          | ∗            :         |         |              :
⟨pl, at⟩           |              : ∗       | ∗       |              :
⟨pl, at ∪ sum⟩ ☺   |              :         | ∗       | ∗            :
⟨pl, sum⟩ ☺        |              :         | ∗       | ∗            :
Tableau 1 Optimization over singular/plural form-meaning pairs

All possible form-meaning combinations are listed in the first column,
and constitute the input to the bidirectional optimization process. The
interpretations that particular forms are paired up with restrict the possible
witnesses of the nominal. Atomic reference, represented as at, limits possible
witnesses to atoms only. Exclusive sum reference, represented as sum, limits
possible witnesses to sums only. Inclusive sum reference, represented as
at ∪ sum, allows witnesses to be chosen both from the domain of atoms and
that of sums.
The four bias constraints, plus the markedness constraint *functN are
ranked across the top, where the left-right order reflects a decreasing order
of strength, and follows the ranking in (19). The two bias constraints *sg, i/e
sum and *pl, at are ranked above the markedness constraint *functN, but
their mutual order is irrelevant, which is reflected in the dotted line between
the two columns. Similarly, (19) requires the two constraints *pl, i/e sum and
*sg, at to be both ranked below *functN, but their mutual order is irrelevant,
as marked by the dotted line.
Because of the set-up with the bias constraints, all form-meaning combi-
nations incur one or more violations, marked by an asterisk ∗ in the relevant
cell. The schema in (19) ranks the bias constraints penalizing the combination
of singular forms with (inclusive or exclusive) sum reference and the combi-
nation of plural forms with atomic reference above the markedness constraint
*functN, which is what drives the optimization over form-meaning pairs in
Tableau 1. The constraints militating against the combination of plural forms
with (inclusive or exclusive) sum reference, or the combination of singular
forms with atomic reference are ranked below *functN, and are de facto
inactive in the optimization process.12
Tableau 1 shows that we assign the (unmarked) singular form the (unmarked)
meaning of atomic reference under strong bidirectional optimization,
because ⟨sg, at⟩ constitutes a bidirectionally optimal pair (☺): there is
no better form to convey atomic reference, and there is no better meaning
to associate with a singular form. The expression of sum reference calls for
the use of a plural form. Both sum and at ∪ sum qualify as sum reference, so
plural forms have exclusive or inclusive sum reference. Accordingly, both
⟨pl, sum⟩ and ⟨pl, at ∪ sum⟩ qualify as bidirectionally optimal pairs (☺). Crucially,
however, a plural form cannot be used in case sums are not part of the
meaning to be expressed, because ⟨pl, at⟩ is suboptimal. In line with Horn’s
division of pragmatic labor then, unmarked forms pair up with unmarked
meanings, and marked forms pair up with marked meanings. Given this
analysis, singular nominals have exclusive atomic reference when in compe-
tition with the plural, while plural nominals have (inclusive or exclusive) sum
reference.13
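The optimization in Tableau 1 can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it assumes that constraints separated by a dotted line can be pooled into a single stratum, that strata are compared lexicographically, and that a pair counts as strongly bidirectionally optimal when no rival form for the same meaning and no rival meaning for the same form has a better violation profile. The function names and the label "at+sum" (for at ∪ sum) are hypothetical.

```python
from itertools import product

FORMS = ("sg", "pl")
MEANINGS = ("at", "at+sum", "sum")  # "at+sum" stands for inclusive sum reference

# Each constraint returns 1 if the pair <form, meaning> violates it, else 0.
def sg_sum(f, m):   # *sg, i/e sum
    return int(f == "sg" and m != "at")

def pl_at(f, m):    # *pl, at
    return int(f == "pl" and m == "at")

def funct_n(f, m):  # *functN: only the plural carries [Pl] in NumP
    return int(f == "pl")

def pl_sum(f, m):   # *pl, i/e sum
    return int(f == "pl" and m != "at")

def sg_at(f, m):    # *sg, at
    return int(f == "sg" and m == "at")

# Ranking (19) as strata, strongest first.
# Assumption: mutually unranked constraints are pooled into one stratum.
RANKING = [(sg_sum, pl_at), (funct_n,), (pl_sum, sg_at)]

def profile(form, meaning, ranking):
    """Violation counts per stratum; tuples compare lexicographically."""
    return tuple(sum(c(form, meaning) for c in stratum) for stratum in ranking)

def strongly_optimal(ranking):
    """Pairs that are best for their meaning and best for their form."""
    winners = []
    for f, m in product(FORMS, MEANINGS):
        own = profile(f, m, ranking)
        best_form = min(profile(f2, m, ranking) for f2 in FORMS)
        best_meaning = min(profile(f, m2, ranking) for m2 in MEANINGS)
        if own == best_form and own == best_meaning:
            winners.append((f, m))
    return winners

full = strongly_optimal(RANKING)
without_functn = strongly_optimal([RANKING[0], RANKING[2]])
without_low_bias = strongly_optimal(RANKING[:2])
print(full)
print(without_functn == full, without_low_bias == full)
```

Under these assumptions the winners are the three pairs marked ☺ in Tableau 1, and the two reduced rankings yield the same winners, in line with the observation in footnote 12.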
We are proposing here a weak singular/strong plural account in which
plurals are formally marked with a feature that is interpreted in compositional
semantics, as spelled out in (20), while singular nominals have no explicit
number feature and are restricted to atomic reference only as a result of the
competition with the plural form. We capture this asymmetry by assuming
that the interpretation of the feature [Pl] is as given in (20), where *P is the
number neutral property denoted by the head noun and its complement (cf.
Section 1.2 above). For any given occurrence of a plural form, either (20a) or
(20b) holds:
(20) a. Pl = λx λ*P [x ∈ Sum ∪ Atom & *P(x)]
b. Pl = λx λ*P [x ∈ Sum & *P(x)]
12 Technically, either the set of four bias constraints or the combination of *functN with the
bias constraints ranked above it (i.e. *sg, i/e sum and *pl, at) is sufficient to obtain three
bidirectionally optimal pairs in the ordinal Tableau 1. That is, leaving out either *pl, i/e
sum and *sg, at or *functN would not change the outcome of the optimization process.
However, in Mattausch’s system, we need a markedness constraint in the learning system in
order to derive a 100% form-meaning distribution in the stochastic grammar. Note also that
*functN plays a key role in the unidirectional optimization in Section 4.
13 The crucial difference between λx [x ∈ Sum ∪ Atom & *P(x)] and λx *P(x) is precisely the
fact that the former is semantically plural, necessitating the possibility of sum reference
while the latter is number neutral and thus truly insensitive to the atom/sum divide. We
have seen in examples (4a-b) that the plural is indeed not insensitive to the atom/sum divide,
and will work out in Section 3.3 an analysis of choice of form that brings out the relevance of
sum reference for plurals.
This interpretation ensures that the denotational space of plural nominals
will always include sums, whether inclusively, as in (20a), or exclusively,
as in (20b). It is this property that makes the plural forms marked relative
to singulars, in terms of Horn’s characterization of markedness in Section
1. Crucially, the two interpretations in (20) are semantically related since
(20b) asymmetrically entails (20a): whenever a witness meets the condition
in (20b), it also meets the condition in (20a) but not the other way around.
This, therefore, is a case of polysemy rather than one of arbitrary ambiguity.
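The entailment between the two senses in (20) can be illustrated on a toy model. This is only a sketch under simplifying assumptions that are not part of the analysis itself: individuals are modeled as non-empty sets of atoms (singletons are atoms, larger sets are sums), *P holds of an individual iff all of its atomic parts are in P, and the domain with two horses and a dog is hypothetical.

```python
from itertools import combinations

def individuals(atoms):
    """All atoms and sums over `atoms`, modeled as non-empty frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c)
            for n in range(1, len(atoms) + 1)
            for c in combinations(atoms, n)]

def is_atom(x):
    return len(x) == 1

def is_sum(x):
    return len(x) >= 2

def star(p_atoms):
    """Number-neutral *P: true of x iff every atomic part of x is in P."""
    p_atoms = frozenset(p_atoms)
    return lambda x: x <= p_atoms

def pl_inclusive(x, star_p):   # sense (20a): x in Sum ∪ Atom and *P(x)
    return (is_sum(x) or is_atom(x)) and star_p(x)

def pl_exclusive(x, star_p):   # sense (20b): x in Sum and *P(x)
    return is_sum(x) and star_p(x)

# Hypothetical toy domain: two horses and one dog.
DOMAIN = individuals({"h1", "h2", "d1"})
star_horse = star({"h1", "h2"})

incl = {x for x in DOMAIN if pl_inclusive(x, star_horse)}
excl = {x for x in DOMAIN if pl_exclusive(x, star_horse)}

# Every witness for (20b) is a witness for (20a), but not conversely.
print(excl < incl)
```

In this model the exclusive sense admits only the sum of the two horses, while the inclusive sense additionally admits each horse atom, so the set of exclusive witnesses is a proper subset of the inclusive ones.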
The semantics in (20) leads to the truth conditions of sentences like (1b)
and a simplified version of (3a) (repeated here as 21a and 22a) as in (21b) and
(22b), respectively:
(21) a. Mary saw horses [exclusive plural]
b. ∃x : [x ∈ Sum & *Horse(x)] [See(m, x)]14
(22) a. Have you seen horses? [inclusive plural]
b. ?[∃x : [x ∈ Sum ∪ Atom & *Horse(x)] [See(addressee, x)]]
Thus, nominals with the feature [Pl] are incompatible with the conceptually
unmarked meaning, namely exclusive atomic reference. The reference of
such forms is restricted by the contribution of the feature [Pl] to a domain
that includes sums either to the exclusion of atoms or not. In Section 3 we
exploit the entailment relation between these two senses to account for the
pragmatic factors that play a role in choosing one sense over the other.
Since our analysis appeals to the feature [Pl] but has not implicated
particular determiners, it carries over straightforwardly to the definite plural
in (23a):
(23) a. Mary touched the horses.
b. ∃!x : [x ∈ Sum & *Horse(x)] [Touch(m, x)]
14 We use here and below First Order Predicate Logic formulas with restricted quantification.
We put square brackets around the Restrictor and the Nuclear Scope, and use capitals
to distinguish logical predicate constants from their natural language counterparts. We
disregard matters that are not directly relevant to us, such as tense interpretation.
We assume that in English the definite article, as well as ‘definite’ possessive
determiners such as your (your horse/your horses), have no number restrictions of
their own since they combine with both singular and plural nominals. In
the case of the latter, the feature [Pl] is present in the NumP and brings its
contribution to the semantic interpretation of the DP. Given that we assume
NumP to be dominated by DP, we also assume that the feature [Pl], like other
agreement features, percolates to the DP in order to trigger plural agreement
outside the DP, as in the case of Subject-Verb agreement.
Our account leads us to expect inclusive plural possessive or definite
DPs alongside inclusive plural indefinites. Example (24) shows that this
expectation is met:
(24) [Instruction for parents picking up their kids from day care after an
outing in different groups]: If your children are back late, you have to
wait.
Your children in (24) is interpreted inclusively, for the instruction is assumed
to be relevant both to parents with a single child and to parents with more
than one child in day care. The inclusive interpretation of the plural posses-
sive in (24) is parallel to the inclusive interpretation of indefinite plurals in
the restrictor of conditionals, exemplified in (3b) and repeated here as (25):
(25) If you have ever seen horses in this meadow you should call us.
The plural definite in (23a), on the other hand, gets an exclusive plural
interpretation on a par with that of the bare plural in (21a).
Singular nominals do not involve a singular feature in NumP and therefore
they do not have an inherent denotation restriction concerning the atom/sum
divide imposed by any of their subparts. The denotation of the singular nom-
inal horse is the number neutral property λx[*Horse (x)], an interpretation
that is insensitive to the atom/sum divide. Crucially, however, we assume
that the interpretation of count nominals in argument position in languages
with morphological number has to involve information concerning the atomic
vs. sum nature of their referent. In other words, a nominal that introduces
a discourse referent (i.e., a nominal of type e) in these languages has to be
interpreted as giving information concerning the atom/sum nature of its
possible witnesses.
Under standard assumptions in Discourse Representation Theory, dis-
course referents are introduced at the point when the D combines with its
sister (cf. Kamp & Reyle 1993; Kamp & van Eijck 1996). Because number
restrictions target the possible values of discourse referents, we assume
that it is at this point that the presence of a number restriction becomes
relevant. In the case of plural nominals, the interpretation of the feature [Pl]
in NumP contributes the required number restriction. In the case of singular
nominals, however, there is no explicit number feature that can contribute
the required number information. When a singular nominal combines with a
determiner that itself is not specified for number, such as the definite article
the, number specification is contributed via the optimization mechanism
given above. Such a singular DP denotes exclusively within the set of atoms
because allowing reference to sums has to involve the presence of [Pl] in
NumP according to Tableau 1. Thus, at the point when a number neutral D
such as the combines with a morphologically singular sister nominal that has
no inherent number specification either, such as horse, the system of con-
straints in Tableau 1 enriches the interpretation of the DP with the constraint
x ∈ Atom imposing exclusive atomic reference on the DP because this is the
optimal number interpretation for a DP that is not marked with the feature
[Pl]. The compositional semantics yields no number requirement on its own
but in the absence of plural morphology, the DP will be interpreted as having
atomic reference.15
Note that our account of the interpretation of singular DPs is similar in
spirit to Krifka’s account of number interpretation for plural nominals. For
Krifka, singular DPs are marked for atomic reference and plural nominals
denote in the complement of the singular forms. For us, plural nominals
are marked for including sums in their reference domain, and singular DPs,
when in competition with plurals, denote in the complement of the plural
form, i.e., they are interpreted as having exclusive atom reference.
The truth conditions of sentences like (1a), repeated here as (26a), involv-
ing a singular form in competition with a plural one, are then as given in
(26b).
(26) a. Mary saw a horse.
b. ∃x : [x ∈ Atom & *Horse(x)] [See(m, x)]
The condition x ∈ Atom is present because nothing in the inherent semantics
of a horse specifies that sum reference is a possibility, and therefore the
restriction to atomic reference is imposed by the constraints in Tableau 1.
This condition then is not contributed by the presence of a particular piece
of morphology in (26a) but rather, by the absence of the feature [Pl].16
15 We are assuming here that the OT system works hand in hand with composition rather
than applying at a particular point in the derivation of the interpretation of an expression.
The type of enrichment we use here is different from the ‘pragmatic enrichment’ proposed
most recently in Chierchia 2004, 2006; Chierchia, Fox & Spector 2008, which relies on a
covert exhaustification operator. Note also that our account of the singular/plural contrast
is different from that assumed in Chierchia 2004, 2006; Chierchia et al. 2008 in that in our
account singular forms do not involve a feature that imposes atomic reference.
The analysis extends in a straightforward way to the definite singular
nominal in (27):
(27) a. Mary touched the horse.
b. ∃!x : [x ∈ Atom & *Horse(x)] [Touch(m, x)]
Note that the definite article requires uniqueness, whereas the indefinite
article just contributes existential quantification. We exploit this difference
in Section 3.2 below.
The account we proposed above allows explicit semantic information
to be contributed by an unmarked form that has no inherent semantics on
the basis of the competition with a marked form with a specific semantics.
The bidirectional OT system spells out the details of a blocking account in
the spirit of Krifka (1989) and Sauerland et al. (2005) with the important
difference that in our system the existence of the semantically and morpho-
logically marked plural form affects the interpretation of the semantically
and morphologically unmarked singular form rather than the reverse.
The most important challenge for a weak singular/strong plural view
is the existence of inclusive readings of plural forms. The polysemous
semantics of plural nominals that we adopted in (20) as the outcome of the
bidirectional optimization process meets this challenge as it leaves room for
both inclusive and exclusive sum reference. Following the spirit though not
the letter of Sauerland et al. (2005) and Spector (2007), we rely on pragmatics
to determine the choice between these two senses in context and give, in the
next section, a pragmatic account of the contrast between (1b), (3a-c) and (4a,
b).
16 We assume here that the atom condition is part of the semantics of the relevant DPs.
Alternatively, one could treat it as an implicature whose generation would rely on the
constraints in Tableau 1. Under both views singular DPs are taken to denote within the realm
of atoms because languages that have a plural form have to use it, other things being equal,
in case sums are among the possible referents of the DP and thus, the existence of the plural
blocks the singular from being interpreted as having sum reference. The implicature analysis
sketched here differs sharply from the use of implicature in Spector (2007) summarized in
Section 1.2 above.
3 The pragmatics of the plural
So far we have worked out a weak singular/strong plural analysis of number
interpretation in which plural nominals are interpretable as having either
inclusive or exclusive sum reference. In Section 1 we saw that both inter-
pretations are indeed available for such nominals. We also saw, however,
that the choice between them is not free: exclusive sum reference is the rule
in upward entailing environments, as exemplified in (21) and (23), whereas
inclusive readings are typically found in downward entailing environments
such as the scope of negation, in the restrictor of a universal or the
antecedent of a conditional, as well as in questions (cf. 22, 24 and 25). In Section
3.1 we turn to the problem of explaining this contrast. We suggest that a
crucial factor regulating the choice between the two senses of the plural is
the independently motivated S(trongest) M(eaning) H(ypothesis), a pragmatic
principle that can be constrained under contextual pressure. We discuss, in
Section 3.2, the predictions this hypothesis makes for the interpretation of
plural forms in quantificational contexts. The interpretation of singular and
plural forms comes closest in environments where the plural is interpreted
inclusively because in such cases both forms are compatible with atomic
witnesses. In Section 3.3 we investigate factors that regulate the choice be-
tween singular and plural forms in downward entailing environments and
questions, environments where a plural is most likely to receive an inclusive
interpretation.
3.1 Strongest meaning hypothesis for number
If plurals can have either an inclusive or an exclusive interpretation, along the
lines of (20), the question of how one chooses between these two possibilities
arises immediately. We have noted above that the exclusive interpretation
asymmetrically entails the inclusive one. This, we claim, makes the choice be-
tween the two interpretations sensitive to the Strongest Meaning Hypothesis.
In this section we make this connection explicit and discuss its predictions.
Recall that Dalrymple, Kanazawa, Kim, Mchombo & Peters (1998) propose
the Strongest Meaning Hypothesis (SMH) to account for the contextual choice
between a range of interpretations for reciprocals. Winter (2001) extends
the principle to instances of Boolean conjunction and quantification. Zwarts
(2004) exploits the SMH as part of his interpretation procedure for the
preposition round. We exploit here the same idea in claiming that the SMH
is one of the factors that govern the choice between the inclusive and the
exclusive sum interpretation of plural nominals.
The Strongest Meaning Hypothesis applies when an expression is assigned
a set of interpretations ordered by entailment and chooses the strongest
element of this set that is compatible with the context.17 The two senses of
the feature [Pl] in our account, given in (20), are ordered by (truth-conditional)
strength: an existentially closed proposition involving the exclusive sense
asymmetrically entails the same proposition involving the inclusive sense.
Because of this relationship the choice between interpretations of [Pl]
falls under the jurisdiction of the SMH. Our hypothesis is formulated as smh_pl
(the Strongest Meaning Hypothesis for Plurals):
smh_pl: the Strongest Meaning Hypothesis for Plurals: for a sentence
involving a plural nominal, prefer that interpretation of [Pl]
which leads to the stronger overall interpretation for the sentence
as a whole, unless this interpretation conflicts with the context of
utterance.
In upward entailing environments exemplified in (21a) and (23a), the sentence
under the exclusive interpretation of the plural entails the sentence under
the inclusive interpretation, and therefore the smh_pl favors the exclusive
interpretation of horses and the horses over the inclusive one. In other words,
the interpretation that Mary saw ‘more than one’ horse is stronger than the
claim that Mary saw ‘one or more’ horses, and therefore the smh_pl favors
the exclusive plural interpretation in (21b). Similarly, the statement that Mary
touched the maximal sum of horses in the context entails the proposition
that Mary touched the maximal set of one or more horses, so the smh_pl also
favors the exclusive plural interpretation in (23b).
In downward entailing environments on the other hand, the smh_pl leads
to the inclusive interpretation because of scale reversal under monotonicity
reversal (see Fauconnier 1979 and much subsequent work).18 The weaker,
inclusive reading of the plural in such contexts leads to a stronger claim for
the sentence as a whole. This indeed is the case for (3a-c, 22, 24, 25). With
respect to (3c), for instance, the proposition that Mary never saw ‘one or
more’ horses (inclusive plural) entails the proposition that Mary never saw
17 Note that Sauerland et al. (2005) also make reference to strength when suspending Maximize
Presupposition in cases where disobeying this principle would lead to a stronger overall
claim, cf. (6) in Section 1.2.
18 Whenever relevant, the notion of downward entailment can be refined to ‘Strawson entail-
ment’, e.g. in conditionals (cf. von Fintel 1999).
more than one horse (exclusive plural). Given that the inclusive interpretation
of the plural in (3c) leads to a stronger claim for the negative sentence than
the exclusive interpretation, the former interpretation is preferred under the
smh_pl. We assume here that the smh_pl is relevant to bringing about the
inclusive interpretation of plurals in questions as well, though the details of
how to compute the strength of questions must remain an open issue for
the time being, despite the fact that the affinity between downward entailing
contexts and questions has been noted for a long time.19 Other things
being equal then, the smh_pl predicts that a plural nominal is interpreted
inclusively in downward entailing contexts and questions, and exclusively in
upward entailing ones. This is indeed the situation we find in (21)-(24).
Note that the SMH as advanced by Dalrymple et al. (1998), Winter (2001)
and Zwarts (2004) is a pragmatic principle, and as such it can be overridden
by contextual pressure. If the smh_pl is indeed responsible for the choice of
interpretation for plural nominals, we expect pragmatic pressure to render
it inoperative, and make inclusive interpretations available even in upward
entailing environments. We argue below that this is indeed the case.
Under the assumption that the speaker knows the facts, the plural form
in sentences such as (21a) and (23a) will receive an exclusive interpretation
which is informationally stronger than the inclusive one. Furthermore, in
these cases there is a single relevant witness for the plural nominal. Under
the assumption that the speaker is in full possession of the facts, she should
know whether this witness is an atom or a sum. In the first case she should
use a singular form because that is the best expression for conveying atomic
reference, given the high ranking of the constraint *pl, at (cf. Tableau 1).
In the latter case, she should use the plural form, given the equally high
ranking of *sg, i/e sum. Under the assumption that the speaker knows what
Mary saw/touched then, there is no possibility to weaken (21a) or (23a) to an
inclusive plural interpretation under the bidirectional optimization process
spelled out in Section 2. But in contexts where the speaker is in fact assumed
to lack information concerning the atomic/sum nature of the relevant
witness, the smh_pl no longer requires the exclusive reading and thus inclusive
readings of plurals become possible even in upward entailing contexts.
19 Obviously, questions are not generally perceived as downward entailing, but they are subject
to the same principle of scale reversal, as evidenced by the well-known fact that NPIs are
often licensed in all these environments (cf. Guerzoni & Sharvit 2007 for a fine-grained
discussion of NPI licensing in questions, and Ladusaw 1996 for a general overview of NPI
licensing).
When not in possession of the relevant information, the speaker may be
assumed to choose an inclusive plural form precisely because this relatively
weak statement (which allows both atoms and sums as possible witnesses)
is the strongest one compatible with the incomplete evidence she has. The
examples in (28) illustrate just such a case:20
(28) a. [Speaker walks into basement, and notices mouse droppings]:
Arghh, we have mice!
b. [Speaker walks into unknown house, and notices toys littering the
floor]: There are children in this house.
Crucially, the utterances in (28) are felicitous with an inclusive interpretation
only in situations in which the speaker finds positive indirect evidence for the
presence of mice and children, but has no way of telling how many there are.
Although the inclusive interpretation is weaker in upward entailing contexts,
it is the strongest possible interpretation of the sentence in a situation of
speaker ignorance, where both atomic and sum reference are compatible with
the information the speaker is assumed to have. The stronger, exclusive, in-
terpretation in this case is not supported by assumptions about the speaker’s
state of knowledge, and one prefers to assume that the speaker is obeying
the maxim of quality over assuming that she makes the strongest claim her
utterance is compatible with. Note that were the speaker to utter (28b) in
her own house, and thus be assumed to be in full possession of the facts,
the interpretation of the plural is correctly predicted to be exclusive again.
Analogous cases are discussed in Zwarts 2004 in terms of a constraint fit
(determining which interpretation fits the context) outranking the constraint
strength. We will not spell out the interpretative tableaux here, but refer
the reader to Zwarts 2004 for a way to do so.
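The reasoning of this section can be mimicked in a small toy model. The sketch below is only an illustration under stated assumptions, not part of the analysis itself: worlds are identified with the number of horses Mary saw, the two senses of [Pl] are modelled as 'one or more' and 'more than one', and contextual compatibility is approximated by requiring that the speaker's information state support the chosen reading. The names WORLDS, SENSES, ENVIRONMENTS, proposition and smh_pl are hypothetical.

```python
# Toy worlds: the number of horses Mary saw (assumption for illustration).
WORLDS = range(0, 4)

# The two senses of [Pl], modelled as predicates over that number.
SENSES = {
    "inclusive": lambda n: n >= 1,  # 'one or more'
    "exclusive": lambda n: n >= 2,  # 'more than one'
}

# An upward entailing environment leaves the predicate as is;
# a downward entailing one is modelled here by plain negation.
ENVIRONMENTS = {
    "upward": lambda p: p,
    "downward": lambda p: (lambda n: not p(n)),
}


def proposition(sense, env):
    """Set of worlds in which the sentence is true under the given sense."""
    pred = ENVIRONMENTS[env](SENSES[sense])
    return frozenset(n for n in WORLDS if pred(n))


def smh_pl(env, speaker_state):
    """Pick the sense giving the strongest sentence the speaker can support.

    speaker_state is the set of worlds the speaker considers possible
    (an assumption used to approximate 'compatible with the context').
    """
    supported = {
        sense: proposition(sense, env)
        for sense in SENSES
        if speaker_state <= proposition(sense, env)
    }
    if not supported:
        return None
    # The candidate propositions are nested, so the smallest set is the
    # one that entails the others, i.e. the strongest reading.
    return min(supported, key=lambda sense: len(supported[sense]))


print(smh_pl("upward", frozenset({3})))        # informed speaker, positive sentence
print(smh_pl("downward", frozenset({0})))      # informed speaker, negated sentence
print(smh_pl("upward", frozenset({1, 2, 3})))  # speaker ignorance, as in (28)
```

In this sketch the exclusive sense wins in the upward entailing case, the inclusive sense wins under negation, and speaker ignorance makes the inclusive sense the strongest supported reading even in an upward entailing context.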
3.2 Plurals in quantificational contexts
So far, we have proposed that the choice between the two senses of the plural
is influenced by monotonicity. In upward entailing contexts, a plural form is
normally interpreted as exclusive whereas in downward entailing contexts
and questions, scale reversal leads to an inclusive interpretation. This raises
the question of what happens in quantificational contexts.21
20 We thank one of the participants in ‘A bare workshop 2’ (LUSH, June 2008) for suggesting
the example in (28a).
21 We are grateful to an anonymous reviewer for suggesting to us to discuss the implications
of our analysis for plurals in quantificational contexts. We only discuss bare plurals and
definite plurals here, not plural some DPs; plural indefinite determiners will be discussed in
Section 4 below.
If the smh_pl is indeed involved in the choice between the two senses
of the plural morpheme we expect, other things being equal, a difference
in interpretation of plurals depending on whether they are in the Restrictor
or the Nuclear Scope of a distributive universal quantifier because the Re-
strictor of such a quantifier is downward entailing and the Nuclear Scope is
upward entailing. We therefore expect the plural in (29) to favor an exclusive
reading:22
(29) Each sportsman is wearing gloves.
In order to test this prediction, we carried out a small-scale pilot experiment.
We set up a picture-matching task, in which participants were
requested to evaluate the Dutch counterpart of (29) in a ‘mixed’ situation in
which some sportsmen were wearing two gloves and others were wearing a
single glove. In order to neutralize the effect of expectations, each person
in the picture was wearing the correct number of gloves required by their
respective sport, so the boxer and the cyclist were wearing two gloves, and
the baseball player was wearing a single glove.23 Participants strongly re-
jected (29) as a correct description of such a mixed situation (23 out of 24
said ‘no’), confirming the prediction that a plural in the Nuclear Scope of a
22 Note that our predictions here differ from those of Spector (2007), where the distinction
between the Restrictor and the Nuclear Scope of universal quantifiers is not assumed to be
relevant to the choice between inclusive and exclusive plurals. Note also that in order to rule
out cumulative and dependent plurals, which are possible in the case of (i) and (ii) we focus
on cases involving distributive each.
i. All children were sitting on small chairs.
ii. Unicycles have wheels.
See Zweig (2008) for relevant discussion of dependent plurals.

23 We thank Bert Le Bruyn for his help in designing and carrying out the experiment. The
experiment was carried out in Dutch. The singular/plural system in Dutch is parallel to
the one in English, and the contrasts between inclusive/exclusive interpretations are easily
reproduced in this language. Dutch iedere (‘each’) proved to be a good universal quantifier
to use because it is strongly distributive. One of the control items involved alle (‘all’), which
easily allows dependent/cumulative interpretations, just like its English counterpart. 24
native speakers served as subjects of the experiment. They were first-year BA students who
had just completed an introduction to linguistics, in which the semantics of plurals was not
discussed. The test was administered electronically. The participants were presented with
a picture and a sentence below it. They were asked to judge whether the sentence gave a
correct description of the situation (yes/no).
distributive universal favors an exclusive interpretation. As expected, the
control item All children were sitting on small chairs, where the quantifier all
gives rise to a cumulative interpretation of the bare plural small chairs, was
widely accepted as the description of a picture with a group of children each
sitting on their own small chair (22 out of 24 participants said ‘yes’). This
preliminary result appears to support the hypothesis that smh_pl is indeed
relevant in choosing between the two senses of [Pl]. Further experimental
work is needed in order to conclusively establish this point.
There is a further problem that arises in connection with plurals in the
Nuclear Scope of distributive universal quantifiers. It has been noted in the
literature that in (30), the definite plural gets an inclusive interpretation (cf.
Sauerland et al. 2005):
(30) Each boy invited his sisters
Sentence (30) can be used to describe a ‘mixed’ situation, in which each boy
invited all the sisters he has, which for some boys means inviting just one
sister while for others, it means inviting several. The question that arises is
how to account for the difference in interpretation between the bare plural
in (29), which seems to favor an exclusive interpretation, and the definite
plural in (30), which seems to allow an inclusive reading more readily. Section
2 developed a unified analysis of plural morphology so if bare plurals and
definite plurals behave differently here, the difference in our account can
only be due to the definite/indefinite contrast. Here we sketch a possible
explanation of the contrast in number interpretation based on the contrast
in definiteness.
The crucial difference, in our account, between (30) and (29) relates to the
contrast between (31) and (32):
(31) Each boy invited his sister

(32) Each boy invited a friend of his
The possessive singular in (31) is interpreted as definite and therefore as
referring to the maximal entity that is a sister of the relevant boy. Because
of the maximality requirement that is part of the semantics of definite
possessives, (31) is false24 in a situation in which some boys have one sister
and they invited her, while others have more than one sister and they invited
all of their sisters. The predicate invited his sister is true only of boys such
24 Or lacks a truth-value, if one prefers to state the maximality requirement as a presupposition.
that the maximal entity that is a sister of theirs is atomic because reference
to sum values is not allowed for a singular DP. A mixed situation in which
each boy invited the maximal entity that encompasses his sister(s) can be
described with a definite plural interpreted inclusively because in that case
the maximality requirement of the definite is met as long as for each boy
in question there is no sister that remains uninvited. The inclusive plural
requirement is met because although some witnesses are atoms there are
others which are sums. Note that, as expected, (30) cannot be used in case all
boys have a single sister whom they invited (cf. Section 3.3 below). The truth
conditions of the definite singular are incompatible with a mixed situation
where no sister is left uninvited and some boys have one sister while others
have more than one, while the truth conditions of the definite plural, under
the inclusive interpretation, are compatible with such a situation.
The indefinite singular on the other hand is truth conditionally compatible
with a mixed situation in which some boys invited one friend of theirs while
others invited several. This is because the predicate invited a friend of his can
be true of a boy that has several friends and invited only one of them precisely
because the indefinite, unlike the definite, has no maximality requirement. If
maximality is not part of the semantics of the sentence, the truth conditions
of a singular form are compatible with a mixed situation, where some boys
invited one of their friends and others invited several.
The contrast between (29) and (30) then is due to the fact that a ‘mixed
situation’ is incompatible with the truth conditions of (31) but compatible
with those of (32). Thus, the contextual pressure to override the smh_pl and
give the plural an inclusive interpretation when in the Nuclear Scope of a
distributive quantifier is stronger in the case of definites than in the case
of indefinites because for definites the singular form is truth conditionally
incompatible with a mixed situation while for indefinites this is not so.
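The truth-conditional side of this contrast can be checked in a toy model. The sketch below is an illustration under simplifying assumptions: each boy is paired with the set of individuals he has (sisters or friends) and the set he invited, maximality is modelled as 'no one left uninvited', and the pragmatic competition between forms is not modelled. All function and variable names are hypothetical.

```python
def each(pred, situation):
    """Distributive 'each': the predicate must hold of every boy."""
    return all(pred(has, invited) for has, invited in situation.values())


def invited_his_sister(has, invited):
    """(31): definite singular, so the maximal sister entity must be atomic."""
    return len(has) == 1 and has <= invited


def invited_his_sisters(has, invited):
    """(30): definite plural read inclusively, so atoms and sums both qualify."""
    return len(has) >= 1 and has <= invited


def invited_a_friend(has, invited):
    """(32): indefinite singular, with no maximality requirement."""
    return len(has & invited) >= 1


# Mixed situation for sisters: one boy has one sister, another has two,
# and nobody's sister is left uninvited.
mixed_sisters = {
    "boy1": ({"s1"}, {"s1"}),
    "boy2": ({"s2", "s3"}, {"s2", "s3"}),
}

# Mixed situation for friends: one boy invited one friend, another invited two.
mixed_friends = {
    "boy1": ({"f1", "f2"}, {"f1"}),
    "boy2": ({"f3", "f4", "f5"}, {"f3", "f4"}),
}

print(each(invited_his_sister, mixed_sisters))   # definite singular
print(each(invited_his_sisters, mixed_sisters))  # definite plural, inclusive
print(each(invited_a_friend, mixed_friends))     # indefinite singular
```

Under these assumptions only the definite singular fails in the mixed situation, which is the asymmetry the explanation above relies on.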
We have claimed in this subsection that the choice between the inclusive
and exclusive senses of the plural is sensitive to the smh_pl, which favors
the exclusive interpretation of plural forms in ordinary upward entailing
environments and the inclusive interpretation in ordinary downward entailing
ones. Since the smh_pl is a pragmatic principle it can be overridden by
contextual factors involving cases where the speaker is assumed to describe
a ‘mixed situation’, one where some relevant witnesses are atoms and others
are sums. This may arise either because of speaker ignorance of the nature
of the relevant witness (as in 28) or because the speaker knows that both
types of witnesses are involved and using the plural form is the best way to
convey this information (as in 30).25
3.3 Implications of the bidirectional analysis for choice of form
According to the account developed so far, the semantic contrast between
singular and plural forms is smallest in downward entailing contexts and
questions. In these environments a singular has atomic reference while a
plural has inclusive sum reference. Both forms therefore are compatible with
atom witnesses. The approach we worked out predicts that a plural form will
be appropriate in such contexts only if sum witnesses are relevant: the OT
system we set up makes the unmarked singular form optimal whenever the
witness domain does not include sums, so the marked plural is appropriate
only when it does. In this section we
show that this prediction is confirmed and discuss some subtle pragmatic
factors that determine whether sum witnesses are relevant in the context or
not.
Our approach predicts that even in environments that lead to inclusive
interpretations, plural forms are sensitive to the presence of sums among
relevant witnesses. We have already seen that this prediction is borne out
in cases such as (4a,b) from Farkas 2006 and Spector 2007 respectively,
repeated here as (33a) and (33b):
(33) a. Does Sam have a Roman nose/#Roman noses?

b. Jack doesn’t have a father/#fathers.
In order to account for the contrast between singular and plural forms in
examples like (33), Spector assumes an additional modal presupposition
associated with (indefinite) plurals that explicitly requires the possibility of
a sum witness. The bidirectional OT analysis developed in Section 2.2 and
25 Spector (2007) raises a further empirical issue, namely the interpretation of the plural in
examples such as (i).
i. Exactly one student brought wine bottles to the party.
This sentence is interpreted as claiming that one student brought more than one bottle
of wine to the party and no other student brought any bottles of wine to the party. This
is a problematic example for us because one and the same plural nominal appears to be
interpreted both exclusively (in the positive part of the interpretation) and inclusively (in
its negative part). This is a problem we leave open for the time being noting that a full
discussion would have to involve both the optimal interpretation of exactly and the way
closely related senses of polysemous items interact with it.
summed up in Tableau 1 accounts for this contrast without any additional
stipulations. The OT analysis does two things simultaneously: it pairs up the
singular and plural forms with their optimal interpretation, and at the same
time it spells out which forms are optimal to express reference to atoms,
to sums, and to a mixture of atoms and sums. Thus, according to Tableau
1, the optimal expression of exclusive atomic reference is a singular form
while in case reference to sums only or to sums as well as atoms is intended
the optimal form is a plural. Therefore, we predict that in case the witness
domain includes atoms only the plural form will be excluded since <sg, at> is
optimal while <pl, at> is not. The plural form has to be chosen when sums are
the only values in the domain of reference (exclusive plural), or when sums
and atoms are included in that domain (inclusive plural) but cannot be used
in case the domain of reference excludes sums. Thus, we predict that even in
environments in which the inclusive plural interpretation is preferred by the
smh_pl, the singular form will be chosen when the nominal is assumed to
take values exclusively from the set of atoms because in such a case the use
of the plural form is suboptimal.
How do we know whether sums are relevant in the context? Given that
the bidirectional OT analysis spells out the syntax-semantics interface for
number, we cannot expect it to determine when reference to sums is intended
by the speaker. That requires knowledge of the world and contextual knowl-
edge, and is therefore a matter of pragmatics.26 The remainder of this section
discusses some of the pragmatic factors that come into play to determine
whether reference to sums is assumed to be relevant or not.
Let us consider first the clear cases. As just mentioned, if sums are
pragmatically excluded from being possible witnesses for reasons of general
world knowledge, as in (33), our analysis predicts that the speaker will choose
a singular form and therefore we predict that the use of the plural form
in (33a, b) is inappropriate. If, on the other hand, atomic witnesses are
excluded for reasons of world knowledge, we predict that a plural form will
be appropriate and the singular will not, as confirmed by (34). The singular
form in (34b) is infelicitous because a singular form cannot be used if the
26 Although Spector (2007) doesn’t work this out in his paper, we take it that he also appeals
to contextual knowledge and knowledge of the world in order to explain why the modal
presupposition introduced by plural indefinites under his analysis is violated in cases like
(33a, b), but not in (3a-c).
pragmatically restricted domain of reference consists of sums alone.
(34) a. Does a dog have eyes?

b. #Does a dog have an eye?
These are clear cases since knowledge of the world tells us that people have
one nose and that eyes come in pairs. The relevance of sums, inherent to the
optimization over forms, accounts for the contrasts in (33) and (34).
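The form-choice predictions just described can be summarized in a few lines. This is a deliberately minimal sketch, assuming that the pragmatically relevant witness domain is given as a set containing the labels "atom" and/or "sum"; it does not reproduce the constraint ranking of Tableau 1, only its stated outcome. The function name optimal_form is hypothetical.

```python
def optimal_form(witness_domain):
    """Return the predicted form ('sg' or 'pl') for a witness domain.

    witness_domain is a non-empty subset of {"atom", "sum"} standing for
    the kinds of witnesses that are pragmatically relevant (assumption).
    """
    domain = set(witness_domain)
    if not domain or not domain <= {"atom", "sum"}:
        raise ValueError("witness_domain must be a non-empty subset of {'atom', 'sum'}")
    # The plural is predicted whenever sums are among the relevant witnesses;
    # otherwise the unmarked singular is optimal.
    return "pl" if "sum" in domain else "sg"


print(optimal_form({"atom"}))         # atoms only, as with noses in (33a)
print(optimal_form({"sum"}))          # sums only, as with eyes in (34a)
print(optimal_form({"atom", "sum"}))  # mixed domain: inclusive plural
```

What the sketch leaves open is exactly the pragmatic question of how the witness domain is determined in a given context.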
There are, however, less clear cases, in which the issue of whether sums
are relevant is a more subtle pragmatic matter. Following Farkas (2006),
we adopt the hypothesis that in some cases there are default expectations
with respect to the atom vs. sum nature of relevant witnesses and that these
expectations affect the choice of a singular vs. a plural in environments
that are otherwise friendly to inclusive plural interpretations. The analysis
developed here accounts for this effect. To exemplify, note that when it comes
to a person having an MA degree, it is simply a default expectation that if
they have such a degree, they will have only one. Nothing stops people from
piling up multiple MA degrees in their academic career, so sum witnesses in
this case are not absolutely excluded. But normally, a person obtains just
a single MA degree, so sum witnesses are not among the expected, default
witnesses. Under the analysis developed above, we expect that the unmarked
way of inquiring whether a person has an MA degree is (35a), with a singular.
The question with the plural form is unusual because it explicitly requires
one to include sums among possible witnesses. Indeed (35b) suggests that
the speaker is inquiring after the possibility of having multiple MA degrees.
The use of the plural here signals deviation from default expectations.
(35) a. Do you have an MA degree?

b. Do you have MA degrees?
By contrast, when it comes to a department that has an MA program, the
default expectation is that there will be more than one MA student in it. Since
sums are now the default witnesses, we expect that the speaker will use a
plural in (36a), when inquiring whether the department has an MA program.
The choice of the singular in (36b) will be highly unusual, since it signals that
sum values are not among the expected witnesses, which runs counter to the
default expectation.
(36) a. Are there MA students in your department?

b. #Is there an MA student in your department?
The contrast between (35) and (36) is due to the difference in whether one
expects sums to be among the relevant witnesses or not. Since the choice of
a plural form always requires sum witnesses to be relevant, such a form is
natural in (36a) but is unusual in (35b).
The pragmatic relevance of sums also plays a role in (37a), the example
most frequently cited as support for the existence of an inclusive reading of
the plural.
(37) a. Do you have children?

b. Do you have a child on our baseball team?
The domain from which the nominal chooses witnesses in (37a) is a mixed one
since there is no default expectation with regard to how many children a per-
son has. In this case then sums are part of the pragmatically relevant domain
and therefore the choice of a plural form is predicted to be appropriate on a
tax form, for instance. In (37b) on the other hand, we changed the example so
that now the presence of sum values among the default witnesses is removed
and, as expected, a singular form is the natural one in a questionnaire in this
case.
In the examples discussed so far, common world knowledge shared
between speaker and hearer is sufficient to account for the optimal choice
of the singular or plural form. The two questions in (38) illustrate that the
choice of form may also depend on the context of use.
(38) a. Do you have a broom? (asked in your kitchen after I spilled peas
on your floor)
b. Do you have brooms? (asked in a store)
As far as we can see, there are no special expectations about people having
one or more than one broom in their house, if they have any. In addition,
given the context of use sketched for (38a), the speaker is not expected
to need more than one broom. The choice of a singular form is therefore
expected, since sum witnesses are not relevant to the situation of use nor
is the relevance of sums imposed by common world knowledge. In a store,
on the other hand, the relevant witness is by default a sum, since stores
normally sell more than one item of a particular type, if they sell that type
of item at all. A plural form then is the natural choice in (38b) not because
the speaker is interested in buying more than one broom but because of the
default sum value expectation associated with the positive answer to her
question.
Further examples that support the generalization that a plural form is
used in case sum values are among the default values of a nominal in a
particular context, and that singular forms are used when this is not the case
are given in (39)-(42):
(39) a. Is Sarah wearing shoes?
b. Is Sarah wearing a hat?
(40) a. Do you have pictures from your wedding?
b. Do you have a picture of Sarah in your wallet?
(41) a. Is there a sauna in this house?
b. Are there nice plants in the garden?
(42) a. Have you bought your Christmas presents already?
b. Have you bought a Christmas present for Aunt Sarah?
Under the bidirectional OT analysis developed in Section 2.2, the singular
form is the optimal choice when the domain of reference includes atomic
values only (cf. Tableau 1). The inclusive plural tolerates atoms in its domain
of reference, but the pair <pl, at> is suboptimal, because of the high ranking
of *pl, at. So even in questions and under negation, the choice of a plural form
requires that sums be included in the domain of the nominal. Sum witnesses
must be relevant in whichever way the context supports this (general world
knowledge or specific situational knowledge). Therefore, the use of a plural
in downward entailing contexts and questions will be natural just in case
intended sum reference can be pragmatically justified. The choice between
a singular and a plural form in contexts where the interpretations of the
singular and the plural overlap thus falls out naturally from our account.
3.4 Taking stock
In Section 2, we took Horn’s division of pragmatic labor to heart, and developed
an analysis of the singular/plural contrast in line with the view that
unmarked forms are paired up with unmarked meanings, and marked forms
with marked meanings. We made no use of a singular feature or morpheme
and assigned the plural a polysemous semantics (inclusive and exclusive sum
reference).
In Section 3 we invoked the Strongest Meaning Hypothesis to account for
the fact that plural forms in ordinary upward entailing environments are
normally interpreted exclusively while the best cases of inclusive plurals are
found in downward entailing environments and questions. In our analysis
the use of a plural form requires sum witnesses to be relevant, a property
that we have argued guides the choice between singular and plural forms
even in contexts where a plural is interpreted inclusively.
The analysis set up so far meets the desiderata (i)-(iii), formulated in
Section 1.4. What remains to be investigated is its cross-linguistic validation
(desiderata iv and v). A full-fledged analysis of languages such as Chinese,
which lack morphological number altogether, goes beyond the scope of
this paper, but note that our set-up is in line with a semantics of Chinese
nominals in terms of general number (Rullmann & You 2003). In contrast
with Farkas & de Swart (2003), we do not take atomic reference to be the
default interpretation for argument nominals in general and therefore the
current analysis is subtler than our earlier proposal. Crucially, in the current
account, the mechanism that associates atomic reference with non-plural
forms requires a morphological opposition between singular (unmarked) and
plural (marked) nominals in the language. In the next section we turn to the
contrast between English and Hungarian DPs in cases where the D lexically
entails sum reference. We argue that plural morphology can be absent in
such cases precisely because, given the semantics of the D, the contribution
of [Pl] is redundant. The difference between Hungarian and English then is
a matter of whether redundant plural morphology is required (English) or
prohibited (Hungarian). We work out a full-fledged account of this contrast
in Section 4.
4 Plural determiners: a cross-linguistic perspective
So far we have concentrated on the interpretive contribution of the morphological
number feature [Pl] when it occurs in nominals in argument position
in languages like English, where a morphological distinction between sin-
gular and plural nominals is operative. In principle, determiners may also
encode information concerning the atom/sum divide. In English, the defi-
nite determiner the combines with both singular and plural nouns, so the
restriction to atomic or sum reference in the case of definite DPs is solely
encoded in morphological information located in NumP. Within the category
of indefinite DPs, just like in the case of definites, number interpretation
is primarily driven by the morphological singular/plural contrast realized
in NumP, though this contrast may be reinforced by determiner choice. A
DP headed by several must refer exclusively to sums, while a DP headed by
a(n) can only refer exclusively to atoms. The indefinite determiner some, on
the other hand, is like the definite article in that it has no inherent lexical
restrictions pertaining to number interpretation.
The core analysis set up in Sections 2 and 3, illustrated with English,
extends to other languages that have a morphological number distinction
such as Germanic or Romance languages as well as to non-Indo-European
languages such as Hungarian. In DPs whose determiner does not contribute
number information, we expect the effect of the feature [Pl] on the nominal
to be the same as in English. We now turn to the data noted in Section 1,
where we saw that English and Hungarian contrast in case the determiner is
lexically marked for sum reference.
4.1 A contrast between English and Hungarian
As outlined in Section 1.4, Hungarian is like English in distinguishing between
singular and plural nominals, with the singular remaining unmarked and
the plural being marked by the presence of the morpheme -(a)k. The facts
of number interpretation in English that we discussed so far are parallel
in Hungarian and therefore the analysis proposed for English extends to
Hungarian as well.
We have seen, however, that there is a crucial difference between the
two languages when it comes to DPs whose determiner is lexically marked
for sum reference. In English, such DPs are morphologically plural, while
in Hungarian they are morphologically singular. We repeat the key relevant
Hungarian facts in (43) (see examples in 9 above):
(43) a. három gyerek [Hungarian]
three child
‘three children’
b. sok gyerek
many child
‘many children’
These DPs are singular in form (and trigger singular agreement with the V
when in subject position), and yet they have exclusive sum reference. This
then is an environment where the semantic contrast between singular and
plural forms is neutralized in Hungarian.
What now needs an explanation is why, in languages that have a morphological
number contrast, we find two options when the D is marked for sum reference:
(i) the language may require the number contrast to be morphologically
expressed by the presence of the feature [Pl], as in the English
three children, many children, or (ii) the language may require the number
contrast to stay morphologically unexpressed, as in the Hungarian három
gyerek, sok gyerek. Note that the difference between these two languages
is purely morphological since the semantic interpretation of the relevant
DPs is identical. We turn to an account of these facts after we review their
significance for competing analyses of number.
4.2 Implications for the weak/strong singular debate
If we analyze a determiner such as three as a generalized quantifier expressing
existential quantification over sums with cardinality of at least three, the
semantics of the DP three children / három gyerek is as in (44). We use [three
NP]sg as shorthand for the Hungarian case, where the DP is singular and
there is no plural feature in NumP and thus no plural suffix on N, and [three
NP]pl as shorthand for the English case, where NumP contains the feature [Pl]
overtly realized as a suffix on N.27
(44) a. [three NP]sg = λP ∃x : [*N(x) & P(x) & |x| ≥ 3]             (✓weak singular)
     b. [three NP]sg = λP ∃x : [*N(x) & P(x) & x ∈ Atom & |x| ≥ 3]  (*strong singular)
     c. [three NP]pl = λP ∃x : [*N(x) & P(x) & |x| ≥ 3]             (✓weak plural)
     d. [three NP]pl = λP ∃x : [*N(x) & P(x) & x ∈ Sum & |x| ≥ 3]   (✓strong plural)
Note that in order to obtain the intended meaning, the fact that the nominal
is singular should have no interpretive consequence here. The weak inter-
pretation of singular nominals in (44a) yields the desired interpretation, but
the strong singular semantics in (44b) does not, which is why the Hungarian
facts are problematic for accounts in which singular forms are semantically
potent while being at least compatible with a ‘weak singular’ approach such
27 The semantics spelled out in (44) may be an oversimplification, given the more fine-grained
analyses of the differences in meaning between three children and at least three children
that have been offered in the recent literature (cf. Nouwen & Geurts 2007 and references
therein). However, the observations made in these works are tangential to the issues at
stake in this paper, because they focus on the role of the determiner, not the singular/plural
distinction on the noun. So we ignore these complications here.
as the one we propose. In our account a singular DP has no inherent atomic
reference requirement contributed by a singular feature. It only acquires
atomic reference when a number restriction is required and is not provided
otherwise. When the nominal is plural, on the other hand, both the weak and
the strong plural analyses yield the right interpretations. Under a weak plural
account, the plural feature does not contribute any number information,
while the determiner requires sum reference given the cardinality require-
ment, as in (44c). Under the strong plural analysis advocated here, the plural
morphology on the noun conveys sum reference in (44d). In this case the
semantic contribution of the feature [Pl] is redundant given that exclusive
sum reference is entailed by the semantic contribution of the determiner.
Thus plural morphology in the DP is redundant when the determiner conveys
sum reference, but it is not harmful.
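The comparison of the four options in (44) can be checked mechanically in a toy model. The following Python sketch is an illustration only and not part of the analysis: it assumes a hypothetical domain of three children, models sums as sets of atoms, and uses invented names (CHILDREN, star, three_np, arrived). It shows that the strong singular option (44b) can never be satisfied, because no atom has cardinality three or more, while (44a), (44c) and (44d) come out equivalent.

```python
from itertools import combinations

# Hypothetical toy domain: three atomic children.
CHILDREN = ("anna", "bela", "csilla")


def star(atoms):
    """Closure under sum formation (*N): all non-empty sets of atoms."""
    return [frozenset(c)
            for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]


def is_atom(x):
    return len(x) == 1


def is_sum(x):
    return len(x) >= 2


def no_restriction(x):
    return True


def three_np(number_condition):
    """Build λP ∃x [*N(x) & P(x) & number_condition(x) & |x| ≥ 3]."""
    def quantifier(P):
        return any(P(x) and number_condition(x) and len(x) >= 3
                   for x in star(CHILDREN))
    return quantifier


def arrived(x):
    """Hypothetical predicate that holds of every child and every sum."""
    return True


results = {
    "44a weak singular": three_np(no_restriction)(arrived),
    "44b strong singular": three_np(is_atom)(arrived),
    "44c weak plural": three_np(no_restriction)(arrived),
    "44d strong plural": three_np(is_sum)(arrived),
}

for label, value in results.items():
    print(label, value)
```

The redundancy of [Pl] noted in the text shows up here as the fact that adding is_sum to a condition that already requires |x| ≥ 3 never changes the truth value.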
The particular weak singular/strong plural analysis developed in Section
2 derives atomic reference for singular nominals under bidirectional opti-
mization. We crucially need this mechanism in Hungarian as well in order to
assign the correct interpretation to ordinary Hungarian singular and plural
definite and indefinite DPs such as a gyerek ‘the child’ and a gyerekek ‘the
children’, which behave just like their English counterparts. But precisely in
case the D entails sum reference, the semantic difference between singular
and plural forms is neutralized under the assumption that singulars have no
semantic import. Due to the semantics of the D, the semantic contribution
of the feature [Pl] is redundant. Because there is no crucial interpretive dif-
ference between [three NP]sg and [three NP]pl the bidirectional optimization
over form-meaning pairs spelled out in Section 2 above does not apply to
these cases. In view of the semantic equivalence between singular and plural
nominals in DPs headed by ‘semantically plural’ determiners, both English
[three NP]pl and Hungarian [három NP]sg are compatible with the analysis
developed so far in this paper. The competition between singular and plural
forms is inoperative precisely when there is no meaning contrast that could
be encoded by these two forms.
A singular nominal can be associated with sum reference when the pos-
sibility of atomic reference is excluded on independent grounds and when
the requirement that argument nominals be specified for number reference
is satisfied by the D. Note, however, that we predict that the reverse is not
possible. Since the plural has a semantic contribution to make, there can be
no language just like English or just like Hungarian except that a plural form
will be used in case the DP entails atomic reference. If the D excludes sum
reference, the use of the marked plural form is predicted to be impossible
and thus DPs like *one/a single children are ruled out in both English and
Hungarian.
So far we have explained how our approach accounts for the possibility
of a singular DP in case the D entails sum reference. What remains to be
explained is what dictates the choice between a singular and a plural form in
cases where the difference is semantically neutralized. In the languages under
consideration the choice between the two forms is not free: English requires
the use of the plural in such cases (*three child), while Hungarian requires the
use of a singular (*három gyerekek ‘three child.pl’). The question we address
next is what drives the choice between these two forms in the grammar. We
discuss it in some detail because, as far as we know, this issue has not been
addressed in the literature.
4.3 A unidirectional OT analysis
The contrast between English and Hungarian nominals headed by a semantically
plural determiner instantiates a shallow syntactic difference that arises
when two forms exist in the language but their semantic difference is
neutralized. We view the presence of the feature [Pl] in the English three
children, many children as number agreement, resulting from a requirement that
imposes the presence of the feature [Pl] on sum denoting nominals. Its absence
in the corresponding Hungarian három gyerek/sok gyerek is seen as a choice
dictated by economy considerations that militate against the use of marked
forms when redundant. Given that we posit the same semantics for English
and Hungarian plural indefinites, such a situation calls for a unidirectional
syntactic OT analysis that establishes a more fine-grained distinction within
the set of languages with a morphological number distinction. We embed
our analysis in an OT typology of number based on classical markedness and
faithfulness constraints.
Recall that in Section 2 we exploited the economy constraint *FunctN that
favours the least number of functional layers on top of the NP. If the plural
feature [Pl] lives in the functional projection of NumP and cardinals and
indefinite determiners live in D, the presence of such expressions constitutes
a violation of *FunctN. Such violations are motivated by the need to satisfy
faithfulness constraints that are ranked above *FunctN. One of these is the
constraint Fpl, favouring the expression of sum reference in a functional
layer above NP.28
(45) Fpl: Sum reference must be encoded in the functional structure of the
nominal.
Input: ∃!x : [x ∈ Sum & *Child(x)]
  Candidate                        Fpl    *FunctN
  a gyerek ‘the child’             ∗      ∗
☞ a gyerekek ‘the child.pl’               ∗∗
Tableau 2 expressive optimization for definite plurals (Hungarian)
Languages that do not have a morphological singular/plural distinction
in nominals (such as Mandarin Chinese, cf. Section 1.4 above) rank Fpl
below *FunctN (see de Swart & Zwarts (2008, 2010)). The morphological
singular/plural distinction in both English and Hungarian is the result of
a grammar in which Fpl outranks *FunctN. But the formulation of Fpl is
more general, and allows the expression of number distinctions by other
elements in the functional layer above the NP in addition to [Pl] in NumP. If
we take the Hungarian determiners három, sok, and the other determiners
in (10), to satisfy Fpl, there is no reason to use the [Pl] feature in NumP. In
fact, the markedness constraint *FunctN forbids its realization given that in
the presence of a determiner that entails sum reference, the feature [Pl] is
redundant. Tableaux 2 and 3 illustrate the optimization process for the plural
definite a gyerekek (‘the children’) and the cardinal három gyerek (‘three
child’).
In both the definite plural and the cardinal plural, we find a violation
of *FunctN because of the presence of an expression in the functional
projection of the nominal.29 Given that the input meaning involves sum
reference, the high ranking of Fpl in the grammar of Hungarian requires
satisfaction of this constraint at the expense of the economy constraint
*FunctN. The definite article a in Hungarian is similar to English the in that
it does not convey number information, so the optimal plural nominal form
incurs a second violation of *FunctN in Tableau 2. In Tableau 3, there is no
reason to use a plural form of the nominal, given that the lexical semantics
of the cardinal D három entails sum reference, and may therefore be taken to
satisfy Fpl. A singular form of the noun is more economical, and is therefore
preferred. In OT terms, the use of a singular nominal in combination with
a determiner that entails sum reference exemplifies the emergence of the
unmarked.
28 Note that the formulation of the constraint Fpl here is slightly different from that in de Swart
& Zwarts 2008, 2010, who did not deal with the complexities of cardinals and indefinite
plural determiners, but focused on ‘plain’ definites and indefinites.
29 The presence of the definite determiner a is licensed by a high ranking of the faithfulness
constraint Fdef governing the expression of definiteness (see Hendriks et al. 2010: chapter
7, de Swart & Zwarts 2008, 2010).
Input: ∃!x : [x ∈ Sum & *Child(x)]
  Candidate                          Fpl    *FunctN
☞ három gyerek ‘three child’                ∗
  három gyerekek ‘three child.pl’           ∗∗
Tableau 3 expressive optimization for plural cardinals (Hungarian)
The ranking Fpl ≫ *FunctN is sufficient to account for Hungarian, but
does not yet capture the cross-linguistic contrast between Hungarian három
gyerek (‘three child’) and English three children. To capture the intuition
that the use of a plural nominal in English in these cases is motivated by
agreement in number between the plural determiner and the noun we posit
an additional constraint Maxpl.
(46) Maxpl: Mark with [Pl] nominals that have sum reference.
Unlike Fpl, Maxpl favours redundant marking of plural morphology within
the nominal, at the expense of extra violations of *FunctN. The advantage
of this multiplication of plural marking is the emphasis on sum reference.
Maxpl is inspired by de Swart’s (2006, 2010) analysis of negative concord in
terms of semantic agreement.30 We suggest that the use of plural nominals in
contexts in which the determiner already conveys sum reference and thereby
satisfies Fpl is governed by a high ranking of Maxpl. Under this analysis, the
grammar of Hungarian has the ranking Fpl ≫ *FunctN ≫ Maxpl, whereas
English exemplifies the grammar {Fpl, Maxpl} ≫ *FunctN. For Hungarian,
the introduction of the new constraint does not affect the optimization
patterns spelled out in Tableaux 2 and 3, because Maxpl is ranked too low
to have an effect. For English, the new ranking leads to the optimal form
three children for the expression of cardinality information over children, as
illustrated in Tableau 4.
30 de Swart posits a constraint Fneg requiring faithfulness to the expression of negation, and
Maxneg requiring a reflection of negation on an indefinite argument within the scope of
negation on the form of the nominal. The high ranking of Maxneg in negative concord
languages leads to a multiplication of negative forms even in contexts in which they are not
needed to satisfy Fneg, and thus convey semantic negation.
Input: ∃x : [*Child(x) & |x| ≥ 3]
  Candidate          Fpl    Maxpl    *FunctN
  three child               ∗        ∗
☞ three children                     ∗∗
Tableau 4 expressive optimization for plural cardinal meaning (English)
Sum reference is entailed by the cardinal determiner three, so Fpl is
satisfied, just like in Hungarian. However, the constraint Maxpl maximizes
the expression of plurality by forcing it to appear in NumP as well. The
high ranking of this constraint in English leads to a preference for agreement
between the determiner and the nominal over a more economical form.
In Hungarian, the constraint Maxpl is ranked below *FunctN, where it is
inoperative.
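The effect of the two rankings can be reproduced with a small evaluator. The Python sketch below is a hedged illustration rather than the authors' implementation. The violation counts for Fpl and *FunctN are read off Tableaux 2 to 4; the Maxpl counts for the Hungarian candidates are an assumption made by analogy with Tableau 4, since Tableaux 2 and 3 do not show that column. Pooling the violations of constraints that share a stratum, as in {Fpl, Maxpl}, is likewise a simplifying assumption, and all Python names are invented for the example.

```python
# Violation counts per candidate form.
HUNGARIAN_DEFINITE = {
    "a gyerek": {"Fpl": 1, "*FunctN": 1, "Maxpl": 1},      # Maxpl count assumed
    "a gyerekek": {"Fpl": 0, "*FunctN": 2, "Maxpl": 0},    # Maxpl count assumed
}
HUNGARIAN_CARDINAL = {
    "három gyerek": {"Fpl": 0, "*FunctN": 1, "Maxpl": 1},    # Maxpl count assumed
    "három gyerekek": {"Fpl": 0, "*FunctN": 2, "Maxpl": 0},  # Maxpl count assumed
}
ENGLISH_CARDINAL = {
    "three child": {"Fpl": 0, "*FunctN": 1, "Maxpl": 1},
    "three children": {"Fpl": 0, "*FunctN": 2, "Maxpl": 0},
}

# A ranking is a list of strata, highest first; constraints in one stratum are tied.
HUNGARIAN = [("Fpl",), ("*FunctN",), ("Maxpl",)]   # Fpl >> *FunctN >> Maxpl
ENGLISH = [("Fpl", "Maxpl"), ("*FunctN",)]         # {Fpl, Maxpl} >> *FunctN


def profile(violations, ranking):
    """Violation totals per stratum, ordered from highest to lowest stratum."""
    return tuple(sum(violations[c] for c in stratum) for stratum in ranking)


def optimal(candidates, ranking):
    """Return the candidate whose profile is lexicographically smallest."""
    return min(candidates, key=lambda form: profile(candidates[form], ranking))


print(optimal(HUNGARIAN_DEFINITE, HUNGARIAN))
print(optimal(HUNGARIAN_CARDINAL, HUNGARIAN))
print(optimal(ENGLISH_CARDINAL, ENGLISH))
```

Lexicographic comparison of the profiles encodes strict domination: a single violation in a higher stratum outweighs any number of violations lower down, which is why the extra *FunctN violation of three children does not matter once Maxpl is ranked above *FunctN.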
Independent support in favor of our analysis comes from L1 acquisition.
Children acquiring a double negation language such as standard English
sometimes go through a phase in which they multiply negation as if they
were speaking a negative concord language. Along similar lines, Hungar-
ian children sometimes mistakenly use the form *három gyerekek (‘three
children’) before they acquire the grammatical három gyerek (‘three child’).
Even though anecdotal, this evidence suggests that child grammar favours
agreement both for negation and number marking.
The analysis of the contrast between Hungarian and English is not in
conflict with the bidirectional optimization process developed in Section
2, but rather, it covers a niche where the competition in form evades the
competition in meaning. With inherently plural determiners, the determiner
entails sum reference for the nominal as a whole, and thus makes irrelevant
the semantic competition between singular and plural forms. In languages
with a morphological singular/plural distinction, this creates room for a new
competition between unmarked (singular) and marked (plural) forms. In
the absence of a difference in meaning, the optimal expression is selected
on purely formal grounds. The competition here is between economy of
form (exemplified by Hungarian), and agreement between D and its sister
(exemplified by English).31
We conclude that Hungarian and English are both members of the class
of languages with a full-fledged morphological singular/plural distinction in
nominals, and a grammar in which Fpl is ranked above *FunctN. However,
there are subclasses within this general class that exploit contrasts in form
for purposes other than expressing a distinction between atomic and sum
reference. Given that agreement in number between a determiner and its sister
is available only in languages with a morphological singular/plural distinction
(instantiating Fpl ≫ *FunctN), we predict such subtleties not to occur in
languages lacking number morphology.
5 Conclusion
The semantics and pragmatics of the plural in languages with a morphological
number distinction have been a problem on the semantics agenda since
McCawley (1981) raised the question of how to reconcile the morphological
markedness of the plural with its seemingly unmarked semantics. The main
point of this paper is to propose a way of resolving this tension, and maintain
Horn’s division of pragmatic labor for number in natural language.
Recent accounts of number interpretation, stemming from Krifka (1989),
accept this tension, and attempt to explain it (Bale et al. in press). Recall
that Sauerland et al. (2005) rests on the assumption that singular forms are
marked with a singular feature that requires atomic reference, while plural
forms involve a feature with no semantic contribution. In Spector 2007,
singulars have the same ‘strong’ semantics, while the plural feature is assigned
a weak semantics equivalent to ‘at least one’. In Sauerland et al. 2005 plural
forms have sum reference because the existence of the semantically more
specific singular form blocks their use in case of atomic reference. In Spector
2007 a similar result is achieved using higher order implicatures.
The approach we developed here shares with these previous proposals
the insight that number interpretation requires a competition-based account
31 We use the term ‘agreement’ here to cover not only cases where morphological features
are shared but also cases where the presence of a morphological feature on one node is
connected to the presence of a semantic constraint on another. Our account therefore is
compatible with a morphological treatment of English which does not use [Pl] as a feature
on Ds.
and involves the blocking of one form by the existence of the other. We couch
it in terms of bidirectional Optimality Theory because this framework is par-
ticularly suitable for capturing the phenomenon of blocking. In Bidirectional
OT, the syntax-semantics interface is defined in terms of optimization over
form-meaning pairs, making use of a mechanism that selects the optimal
meaning for a particular form, and the optimal form for a particular meaning.
The crucial novelty of this paper is that it reverses the direction of block-
ing. We have worked out a weak singular/strong plural account of number
interpretation for the languages under consideration, in which there is no
singular feature and no special semantics associated with singular forms
while plural forms are assumed to involve a semantically potent plural fea-
ture. The main conceptual advantage of such an approach is that it reconciles
semantic and formal markedness when it comes to number interpretation
and explains why in the languages under consideration there is a plural
morpheme but no special singular marking. The main empirical advantage of
our approach is that it predicts the possibility of using singular forms with
sum reference in case the semantic distinction between singular and plural
forms is neutralized, a possibility that is realized in Hungarian.
We have adopted the abstract system developed in Mattausch 2005, 2007
and adapted it to the morphology and semantics of number. Crucially, we
have suggested that the relevant semantic markedness parameter for the
languages under consideration is the distinction between the conceptually
unmarked atom reference and the conceptually marked inclusive or exclusive
sum reference.
The system we propose associates marked plural forms with marked sum
reference interpretation and unmarked singular forms with unmarked atomic
reference. The marked plural form is associated with the requirement that
sums be included among possible witnesses of the nominal, a requirement
that is realized by giving the feature [Pl] a polysemous semantics, with one
sense reserved for the exclusive interpretation and the other for the inclusive
interpretation. The unmarked singular form has no inherent semantics, but
under bidirectional optimization, it takes on the meaning complementary to
that of the marked plural, namely exclusive atomic reference.
We have proposed a weak singular/strong plural approach in which formal
and interpretational markedness are parallel, a pattern we find elsewhere
in natural language. At the same time, our proposal meets the challenge
posed by the existence of plural forms interpreted inclusively. In fact, once
we adopt the view that having sum reference is the conceptually marked
interpretation, the account makes us expect plural forms to be used both for
inclusive and exclusive sum reference. What the system rules out, however,
is a plural form used when the existence of a sum witness is excluded. This
is a welcome result. The relevance of sum values to all uses of plurals in
the languages under consideration follows from our analysis without having
to assume a strong semantics for singulars (as in Sauerland et al. 2005) or
having to add a special modal presupposition for plurals (as in Spector 2007).
In our approach, just as in previous proposals, the competition between
the inclusive and the exclusive interpretation of plural forms is decided by
pragmatic rather than semantic factors. We have relied on applying the
Strongest Meaning Hypothesis to the interpretation of the plural, which
correctly predicts that plural forms will be interpreted exclusively in ordinary
upward entailing contexts and inclusively when under the scope of negation
or in the Restrictor of conditionals or distributive universals. But even in
contexts in which inclusive readings are permitted, sum reference must be
relevant. We have seen that subtle pragmatic factors determine in which
contexts and situations sum reference is relevant.
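The division of labour predicted by the Strongest Meaning Hypothesis can be illustrated with a minimal model. The Python sketch below is an assumption-laden toy and not the formal system of the paper: worlds are distinguished only by how many entities of the relevant kind satisfy the predicate (a hypothetical scenario such as counting the horses someone saw), the inclusive reading requires at least one, the exclusive reading at least two, and the stronger reading is the one that entails the other. All names are invented for the example.

```python
# Toy worlds: the number of relevant entities satisfying the predicate (0 to 3).
WORLDS = range(0, 4)

# Propositions are sets of worlds.
inclusive = {w for w in WORLDS if w >= 1}   # witness may be an atom or a sum
exclusive = {w for w in WORLDS if w >= 2}   # witness must be a sum


def negate(prop):
    """Complement of a proposition relative to the toy set of worlds."""
    return set(WORLDS) - prop


def strongest(readings):
    """Labels of the readings that entail every competing reading."""
    return [name for name, p in readings.items()
            if all(p <= q for q in readings.values())]


upward_entailing = {"inclusive": inclusive, "exclusive": exclusive}
under_negation = {name: negate(p) for name, p in upward_entailing.items()}

print(strongest(upward_entailing))
print(strongest(under_negation))
```

In the plain context the exclusive reading is the stronger one, and under negation the entailment reverses so that the inclusive reading wins, in line with the pattern described in the text. The sketch says nothing about the further requirement that sum reference be relevant.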
The main theoretical contribution of the account we developed here is
that it respects the Horn pattern while at the same time accounting for the
existence of inclusive plurals as well as for the main dividing line between
inclusive and exclusive plurals. We have shown here that such an account
is both possible and desirable. On the empirical side, our approach has
the advantage of accounting for the relevance of sum reference with plural
forms, as well as predicting the possibility of singular nominals with sum
reference just in case sum reference is imposed by D independently of what
is found in NumP. This is indeed the case of Hungarian singular DPs such as
sok gyerek ‘many child’. We have presented an account of these facts that
treats the singular form of these DPs as the result of the language valuing
functional economy over the pressure to mark sum reference uniformly with
the feature [Pl]. The obligatory plural form of such DPs in English is due to
this language valuing uniform [Pl] marking of sum denoting DPs higher than
functional economy. The account we propose then meets what we take to be
the main challenges number semantics faces without having to rely on any
tools that are not independently motivated.
References
Bale, Alan, Michaël Gagnon & Hrayr Khanjian. in press. On the relationship
between morphological and semantic markedness: the case of plural
morphology. Journal of Morphology http://linguistics.concordia.ca/bale/
pdfs/Morphology%20paper.pdf.
Beaver, David. 2002. The optimization of discourse anaphora. Linguistics and
Philosophy 27(1). 3–56. doi:10.1023/B:LING.0000010796.76522.7a.
Beaver, David & Hanjung Lee. 2004. Input-output mismatches in OT. In
Reinhard Blutner & Henk Zeevat (eds.), Optimality theory and pragmat-
ics, 112–153. Palgrave/MacMillan. https://webspace.utexas.edu/dib97/
publications.html.
Blutner, Reinhard. 1998. Lexical pragmatics. Journal of Semantics 15(2).
115–162. doi:10.1093/jos/15.2.115.
Blutner, Reinhard. 2004. Pragmatics and the lexicon. In Laurence Horn &
Gregory Ward (eds.), Handbook of pragmatics, 488–514. Oxford: Blackwell.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena and the
syntax/pragmatics interface. In A. Belletti (ed.), Structures and beyond,
39–103. Oxford: Oxford University Press.
Chierchia, Gennaro. 2006. Broaden your views: implicatures of domain
widening and the "logicality" of natural language. Linguistic Inquiry 37(4).
535–590. doi:10.1162/ling.2006.37.4.535.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammati-
cal view of scalar implicatures and the relation between semantics and
pragmatics. In Klaus von Heusinger, Claudia Maienborn & Paul Portner
(eds.), Semantics. An international handbook of natural language meaning,
Mouton de Gruyter, New York, NY.
Cohen, Ariel. 2005. More than bare existence: An implicature of existential
bare plurals. Journal of Semantics 22(4). 389–400. doi:10.1093/jos/ffh031.
Corbett, Greville G. 2000. Number. Cambridge University Press, Cambridge.
doi:10.2277/0521640164.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stanley
Peters. 1998. Reciprocal expressions and the concept of reciprocity.
Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Farkas, Donka F. 2006. The unmarked determiner. In Svetlana Vogeleer &
Liliane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 81–106.
John Benjamins, Amsterdam.
Farkas, Donka F. & Henriëtte E. de Swart. 2003. The semantics of incorporation:
from argument structure to discourse transparency. CSLI Publications,
Stanford: CA.
Farkas, Donka F. & Draga Zec. 1995. Agreement and pronominal reference.
In Guglielmo Cinque & Giuliana Giusti (eds.), Advances in Roumanian
linguistics, 83–101. John Benjamins, Amsterdam.
Fauconnier, Gilles. 1979. Implication reversal in natural language. In Franz
Guenthner & Siegfried J. Schmidt (eds.), Formal semantics and pragmatics
for natural languages, Reidel.
Feigenson, Lisa & Susan Carey. 2003. Tracking individuals via object files:
evidence from infants’ manual search. Developmental Science 6(5). 568–
584. doi:10.1111/1467-7687.00313.
Feigenson, Lisa & Susan Carey. 2005. On the limits of infants’
quantification of small object arrays. Cognition 97(3). 295–313.
doi:10.1016/j.cognition.2004.09.010.
Feigenson, Lisa, Susan Carey & Marc Hauser. 2002. The representations
underlying infants’ choice of more: object files versus analog magnitudes.
Psychological Science 13(2). 150–156. doi:10.1111/1467-9280.00427.
von Fintel, Kai. 1999. NPI-licensing, Strawson entailment and context-
dependency. Journal of Semantics 16(2). 97–148. doi:10.1093/jos/16.2.97.
Greenberg, Joseph. 1966. Language universals. Mouton, the Hague.
Guerzoni, Elena & Yael Sharvit. 2007. A question of strength: on NPIs
in interrogative clauses. Linguistics and Philosophy 30(3). 361–391.
doi:10.1007/s10988-007-9014-x.
Hauser, Marc, Susan Carey & L.B. Hauser. 2000. Spontaneous number
representation in semi-free-ranging rhesus monkeys. In Royal Society of London:
Biological sciences, vol. 267, 829–833. doi:10.1098/rspb.2000.1078.
Heim, Irene. 1991. Artikel und Definitheit. In Arnim von Stechow & Dieter
Wunderlich (eds.), Handbuch der Semantik, Berlin: de Gruyter.
Hendriks, Petra, Helen de Hoop, Henriëtte de Swart & Joost Zwarts. 2010.
Conflicts in interpretation. Equinox Publishing (in press), preprint http:
//www.let.rug.nl/~hendriks/conflict.htm.
Horn, Laurence. 2001. A natural history of negation. CSLI Publications,
Stanford: CA.
Ionin, Tania & Ora Matushansky. 2006. The composition of complex cardinals.
Journal of Semantics 23(4). 315–360. doi:10.1093/jos/ffl006.
Jäger, Gerhard. 2003. Learning constraint sub-hierarchies. In Reinhard
Blutner & Henk Zeevat (eds.), Pragmatics and optimality theory, 251–287.
Houndmills: Palgrave MacMillan.
Jakobson, Roman. 1939. Signe zéro. In Mélanges de linguistique offerts à
Charles Bally, Genève (also in Selected Writings II).
Kamp, Hans & Jan van Eijck. 1996. Representing discourse in context. In
Johan van Benthem & Alice ter Meulen (eds.), Handbook of logic and
linguistics, 179–237. Amsterdam: Elsevier.
Kamp, Hans & Uwe Reyle. 1993. From discourse to logic. Dordrecht: Kluwer
Academic Publishers.
Kester, Ellen-Petra & Christina Schmitt. 2007. Papiamentu and Brazilian
Portuguese: a comparative study of bare nominals. In Marlyse Baptista
& Jacqueline Guéron (eds.), Noun phrases in creole languages: a multi-
faceted approach, Amsterdam: Benjamins.
Kirby, Simon & Jim Hurford. 1997. The evolution of incremental learning:
language, development and critical periods. In Antonella Sorace, Caroline
Heycock & Richard Shillcock (eds.), Gala ’97 conference on language
acquisition, HCRC, Edinburgh University.
Kouider, Sid, Justin Halberda, Justin Wood & Susan Carey. 2006. Acquisition
of English number marking: the singular-plural distinction. Language
Learning and Development 2. 1–25. doi:10.1207/s15473341lld0201_1.
Krifka, Manfred. 1989. Nominal reference, temporal constitution and quantifi-
cation in event semantics. In Renate Bartsch, Johan van Benthem & Peter
van Emde Boas (eds.), Semantics and contextual expression, Dordrecht:
Foris publication.
Krifka, Manfred. 1995. Common nouns: a contrastive analysis of English and
Chinese. In Greg N. Carlson & Francis Jeffry Pelletier (eds.), The generic
book, 398–411. Chicago University Press. http://amor.rz.hu-berlin.de/
~h2816i3x/.
Kwon, Song-Nim & Anne Zribi-Hertz. 2006. Bare objects in Korean:
(pseudo)incorporation and (in)definiteness. In Svetlana Vogeleer & Lil-
iane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 107–132.
John Benjamins, Amsterdam.
Ladusaw, William. 1996. Negation and polarity items. In Shalom Lappin
(ed.), The handbook of contemporary semantic theory, 321–341. Oxford:
Blackwell.
Lakoff, Robin. 2000. The language war. University of California Press,
Berkeley.
Link, Godehard. 1983. The logical analysis of plural and mass nouns: a lattice-
theoretic approach. In Rainer Bäuerle, Christoph Schwarze & Arnim von
Stechow (eds.), Meaning, use and interpretation of language, 302–323.
Berlin: de Gruyter.
Mattausch, Jason. 2005. On the optimization and grammaticalization of
anaphora: Humboldt University Berlin, published as ZAS Papers in Lin-
guistics 38 dissertation. http://www.zas.gwz-berlin.de/mitarb/homepage/
mattausch/Dissertation-Mattausch.pdf.
Mattausch, Jason. 2007. Optimality, bidirectionality and the evolution of
binding phenomena. Research on Language and Computation 5(1). 103–
131. doi:10.1007/s11168-006-9018-7.
McCawley, Jim. 1981. Everything that linguists have always wanted to know
about logic (but were ashamed to ask). Chicago: University of Chicago
Press (2nd edition 1993).
Nouwen, Rick & Bart Geurts. 2007. At least et al.: the semantics of scalar
modifiers. Language 83. 533–559. http://ncs.ruhosting.nl/bart/.
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics
and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Rullmann, Hotze. 2003. Bound-variable pronouns and the semantics of
number. In Western conference on linguistics, vol. 14, 243–254. WECOL
2002. http://semanticsarchive.net/Archive/DM3ODk0N/.
Rullmann, Hotze & Aili You. 2003. General number and the semantics and
pragmatics of indefinite bare nouns in Mandarin Chinese. ms. UBC. http:
//semanticsarchive.net/Archive/jhlZTY3Y/.
Sauerland, Uli. 2003. A new semantics for number. In Rob Young & Yuping Zou
(eds.), Salt 13, vol. 13, 258–275. CLC publications. http://semanticsarchive.
net/Archive/TM0YjdiO/salt03paper.pdf.
Sauerland, Uli. 2008. On the semantic markedness of phi-features. In Harbour,
D. et al. (ed.), Phi theory, 57–83. Oxford: Oxford University Press. http:
//www.zas.gwz-berlin.de/home/sauerland/downloads.html.
Sauerland, Uli, J. Anderssen & J. Yatsushiro. 2005. The plural is semantically
unmarked. In Stephan Kepser & Marga Reis (eds.), Linguistic evidence, de
Gruyter. doi:10.1515/9783110197549.413.
Spector, Benjamin. 2007. Aspects of the pragmatics of plural morphology:
On higher-order implicatures. In Uli Sauerland & Penka Stateva (eds.),
Presuppositions and implicatures in compositional semantics, 243–281.
Palgrave/MacMillan. http://lumiere.ens.fr/~bspector/.
de Swart, Henriëtte. 2006. Marking and interpretation of negation: a bi-
directional OT approach. In Raffaella Zanuttini, Héctor Campos, Elena
Herburger & Paul Portner (eds.), Negation, tense and clausal architecture:
Cross-linguistic investigations, 199–218. Georgetown: Georgetown Univer-
sity Press. http://www.let.uu.nl/~Henriette.deSwart/personal/negot.pdf.
de Swart, Henriëtte. 2010. Expression and interpretation of negation: an OT
typology. Dordrecht: Springer (in press).
de Swart, Henriëtte & Joost Zwarts. 2008. Article use across languages: an
OT typology. In Atle Grønn (ed.), Sinn und Bedeutung, vol. 12, 628–644.
University of Oslo.
de Swart, Henriëtte & Joost Zwarts. 2009. Less form, more mean-
ing: why bare nominals are special. Lingua 119(2). 280–295.
doi:10.1016/j.lingua.2007.10.015.
de Swart, Henriëtte & Joost Zwarts. 2010. Optimization principles in the typol-
ogy of number and articles. In Bernd Heine & Heiko Narrog (eds.), Hand-
book of linguistic analysis, Oxford: Oxford University Press. http://www.
let.uu.nl/~Henriette.deSwart/personal/oupdeSwartZwartsmay08.pdf.
Winter, Yoad. 2001. Plural predication and the strongest meaning hypothesis.
Journal of Semantics 18(4). 333–365. doi:10.1093/jos/18.4.333.
Wood, Justin, Sid Kouider & Susan Carey. 2004. The emergence of sin-
gular/plural distinction. Poster presented at the biennial International
Conference on Infant Studies. http://www.wjh.harvard.edu/~lds/index.
html?carey.html.
Zwarts, Joost. 2004. Competition between word meanings: the polysemy of
around. In Sinn und Bedeutung, 349–360. Konstanz. http://www.let.uu.
nl/users/Joost.Zwarts/personal/.
Zweig, Eytan. 2008. Dependent plurals and plural meaning: NYU dissertation.
http://www-users.york.ac.uk/~ez506/.
Donka Farkas
Department of Linguistics
University of California at Santa Cruz
Stevenson College
1156 High Street
Santa Cruz, CA 95064, USA
farkas@ucsc.edu

Henriëtte de Swart
Department of Modern Languages
Utrecht University
Trans 10
3512 HD Utrecht
The Netherlands
h.deswart@uu.nl
doi: 10.3765/sp.3.2
Embedded Implicatures and Experimental Constraints:
A Reply to Geurts & Pouscoulous and Chemla∗
Uli Sauerland
Zentrum für Allgemeine
Sprachwissenschaft, Berlin

2009-12-08 / Published 2010-01-25
Abstract
Experimental evidence on embedded implicatures by Chemla (2009b)
and Geurts & Pouscoulous (2009a) has fewer theoretical consequences than
assumed: On the one hand, the evidence successfully argues against oblig-
atory local implicature computation, which has however already been dis-
credited. On the other hand, the data are fully consistent with optional local
implicature computation.
Keywords: conversational implicature, embedded implicature, experimental pragmatics,
free-choice permission, truth dominance
Both Chemla (2009b) (C in the following) and Geurts & Pouscoulous (2009a)
(G&P in the following) in recent papers in this journal provide welcome new
experimental evidence on embedded implicatures. However, while their work
takes us a couple of steps closer to full understanding of the issue, I will
argue that in both papers the theoretical implications of the new data are
overstated and much work remains to be done.
Intuitively clear cases of embedded implicatures are examples like (1).
(1) a. If you ate some of the cookies and no one else ate any, then there
must still be some left. (Levinson 2000: 205)
b. Mary solved the first problem or the second problem or both
problems. (Chierchia et al. 2008: (31))
∗ I thank Nicole Gotzner, Lisa Hartmann and the editors of this journal for their help with
this paper, and the German Research Foundation (DFG grant SA 925/1 in the Emmy Noether
Programm) for financial support.
©2010 Uli Sauerland

Here the implicatures of some and or are part of the truth-conditional content
of an embedded sentence: the conditional in (1a), which is understood as
if you ate some and not all of the cookies, and a disjunct in (1b), which is
understood as Mary solved either the first problem or the second problem and
not both. In these examples, the sentence without the embedded implicature
would be either contradictory (If you ate some or all of the cookies, then there
must still be some left) or a violation of a pragmatic constraint (#Mary solved
at least one of the problems or both problems, see Singh 2008).
The question theorists of all stripes are faced with is whether and how to
integrate these phenomena into a general theory of implicatures, or at least
of quantity or scalar implicatures. Some narrower directions that have been
pursued to address the general question raised by embedded implicatures
are listed in (2).
(2) I. How frequently and under what conditions do embedded implicatures
arise?
II. Are embedded implicatures a uniform phenomenon? Or more
specifically: How many mechanisms can give rise to embedded
implicatures?
III. Are embedded implicatures really implicatures? Or more specifi-
cally: When is the mechanism giving rise to embedded implicatures
the same one as the mechanism giving rise to global implicatures?
G&P and C both focus on the first of these directions, and take this
discussion as far as it can be taken presently. However, I argue that just
pursuing the first direction is insufficient to resolve the issues embedded
implicatures raise fully. Specifically, I show that independent pragmatic
constraints, in particular the constraint of Truth Dominance (Meyer &
Sauerland 2009), predict conditions on when embedded implicatures can
be detected that are largely independent of the account of embedded
implicatures assumed. Therefore, the observations on the presence and absence
of (embedded) implicatures by G&P and C are consistent with a much wider
range of theories of implicatures than the original papers suggest. In the
second section of the paper, I address a finding by C on the embedding of
free-choice effects that speaks to directions II and III of (2). I argue that this
finding is more significant for the account of embedded implicatures and
speculate on two theoretical ideas that would account for it. I conclude that
C’s second result is the most important one for the theory of implicatures
from these two papers.
1 Embedded Implicatures are Still Repairs
G&P are exclusively concerned with the first issue of (2). The primary target of
G&P is an extreme view of localism espoused by Levinson (2000) and Chierchia
(2004).1 The Levinson/Chierchia view predicts that implicatures should
always be fully local unless a cancellation mechanism applies. A number
of people have noted that this prediction seems to be intuitively wrong in
many cases. Specifically, this holds in case the embedded implicature is
not needed to make the sentence coherent (see for instance Geurts 2009;
Russell 2006; Sauerland 2004b). Compare (3) with (1): Intuitively, (3a) does
not seem to mean the same as If you ate some but not all of the cookies,
then you must have liked them. And for the multiple disjunction in (3b), the
paraphrase Mary solved either exactly one or all three of the problems, which
local computation of implicatures predicts, is clearly off the mark.
(3) a. If you ate some of the cookies, then you must have liked them.
b. Mary solved the first problem or the second problem or the third
problem.
In these two examples, the addition of a local implicature to the truth
conditions results in weaker truth conditions for the entire sentence. In
(3a) the implicature trigger occurs in a downward entailing environment, but
not in (3b). Such examples show that local implicatures cannot be obligatory.2
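The paraphrase that local computation yields for (3b) can be checked mechanically. The following is a minimal sketch, assuming (as a simplification not spelled out in the text) that a locally computed implicature turns each binary or into an exclusive disjunction and that (3b) is parsed as [[A or B] or C]; the function names are illustrative only.

```python
from itertools import product

def xor(p: bool, q: bool) -> bool:
    """Exclusive disjunction: 'p or q' strengthened locally by 'not both'."""
    return p != q

def local_reading(a: bool, b: bool, c: bool) -> bool:
    """(3b) parsed as [[A or B] or C], with each 'or' read exclusively."""
    return xor(xor(a, b), c)

# For every situation verifying the reading, record how many problems were solved.
counts_true = sorted({sum(v) for v in product([False, True], repeat=3)
                      if local_reading(*v)})
print(counts_true)  # [1, 3]
```

Under these assumptions the reading holds when exactly one or all three problems are solved, which matches the paraphrase given for local computation.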
Examples where local implicatures would cause truth conditions that are
stronger overall are the focus of G&P. The data presented by G&P show to
my full satisfaction that the prediction of the proposals of Levinson and
Chierchia is wrong also for these cases. Their result is also consistent with
other experimental results presented by Chemla (2009a), Schwarz, Clifton
& Frazier (2008) and Bezuidenhout, Morris & Widman (2009), who look
at different environments: mostly downward entailing cases like negative
attitude verbs, the restrictor of a universal quantifier and conditional, but
Bezuidenhout et al. (p. 139) also look at the scope of conditionals and present
some findings similar to G&P, though less striking. In sum, the proposals of
1 I am not sure whether any researcher active in this area still holds this extreme view: The
unpublished paper by Chierchia et al. (2008) that G&P cite seems not fully consistent to me
in this regard (Sauerland submitted): initially it adopts the view of Chierchia (2004), however,
later the quite different view of Fox (2007) is assumed without any comment on the shift.
2 Acknowledging this problem, Chierchia (2004) proposes that cancellation of implicatures is
obligatory in downward entailing environments. Still, (3b) remains a problem for Chierchia’s
proposal.
Levinson (2000) and Chierchia (2004) are falsified by the experimental data
to the extent possible.3
G&P claim their results also argue against another view, which they call
Minimal Conventionalism. However, I will show that G&P are mistaken:
Actually, their result says nothing about Minimal Conventionalism once we
take into account general pragmatic constraints on how ambiguous sentences
are judged. To show this, I consider the view of Fox (2007) as a concrete
example of G&P’s Minimal Conventionalism. I motivate the general pragmatic
principle of Truth Dominance and then argue that G&P’s data are fully
consistent with Fox’s (2007) account once Truth Dominance is taken into
account.
Fox’s (2007) account is non-committal on the locality of implicature com-
putation, allowing it to apply locally, but also globally. He assumes that
implicatures can be contributed to the meaning of a sentence by the grammatical
operator Exh.4 Fox (2007: p. 79 & p. 97) defines the Exh operator via
the three statements in (4) through (6) (with minor notational adjustments).
The operator depends on a contextually provided set of alternative proposi-
tions C, which can be taken to be the scalar alternatives of the argument of
Exh in the examples in the following.
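(4) $NW_C(p) = \{q \in C \mid p \text{ does not entail } q\}$
(5) $q$ is innocently excludable given $C$ if and only if
$\neg\exists q' \in NW_C(p)\ [p \wedge \neg q] \rightarrow q'$
(6) $Exh_C(p)(w) \Leftrightarrow p(w)$
$\&\ \forall q \in NW_C(p)\ [q \text{ is innocently excludable given } C] \rightarrow \neg q(w)$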
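Definitions (4) through (6) can be sketched over a finite set of worlds. The following is a minimal illustration, assuming that propositions are modeled as sets of worlds, that entailment is the subset relation, and that the alternatives of a disjunction are the disjunction itself, its two disjuncts, and their conjunction. The code follows the three statements as given here, not necessarily every detail of Fox's original formulation, and all names are hypothetical.

```python
from itertools import product

ATOMS = ("A", "B")
# A world is modeled as the set of atomic sentences true in it.
WORLDS = [frozenset(a for a, v in zip(ATOMS, vals) if v)
          for vals in product([False, True], repeat=len(ATOMS))]

def prop(pred):
    """A proposition is the set of worlds in which pred holds."""
    return frozenset(w for w in WORLDS if pred(w))

def entails(p, q):
    return p <= q

def nw(p, alts):
    """(4): the alternatives in C that p does not entail."""
    return [q for q in alts if not entails(p, q)]

def innocently_excludable(q, p, alts):
    """(5) as stated: 'p and not q' entails no member of NW_C(p)."""
    p_and_not_q = p - q
    return not any(entails(p_and_not_q, q2) for q2 in nw(p, alts))

def exh(p, alts):
    """(6): p together with the negation of each innocently excludable alternative."""
    result = p
    for q in nw(p, alts):
        if innocently_excludable(q, p, alts):
            result = result - q
    return result

a = prop(lambda w: "A" in w)
b = prop(lambda w: "B" in w)
a_or_b, a_and_b = a | b, a & b
strengthened = exh(a_or_b, [a_or_b, a, b, a_and_b])
print(sorted(sorted(w) for w in strengthened))  # [['A'], ['B']]
```

On this toy model, neither disjunct is innocently excludable (excluding one would entail the other), while the conjunction is, so the exhaustified disjunction comes out exclusive.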
Consider now Fox’s (2007) account for example (7). The account predicts
an ambiguity between a local+global reading, which corresponds to structure
(8a), and a global-only reading, which corresponds to structure (8b).
(7) All the squares are connected with some of the circles. (G&P: (26a))
3 Of course, there are always ways to save any scientific theory by adding additional assump-
tions, but nothing short of almost obligatory local cancellation of the proposed obligatory
local implicatures would seem to do the trick in this case.
4 There are two major differences between Fox’s account and that of Chierchia (2004): First,
Fox does not require local application of implicature computation. And second, his Exh
operator is different from Chierchia’s due to the appeal to innocent excludability. The second
difference does not matter for the following, but is important for the analysis of disjunctions
(Sauerland submitted).
(8) a. Exh All the squares λx Exh x be connected with some of the
circles.
b. Exh All the squares λx x be connected with some of the circles.
The two readings stand in a special logical relationship: the local+global
reading logically entails the weaker, global-only reading. From work on scope
ambiguity resolution, it is independently known that speakers’ intuitions are
affected by the entailment relation between the two readings. I adopt the
principle Truth Dominance from Meyer & Sauerland (2009) to account for this
effect because one case they consider is exactly analogous to (8),5 namely, the
German example (9). Most theories of quantifier scope in German predict (9)
to be ambiguous between two structural representations that should give rise
to the two readings given below (9). However, previous researchers (Büring &
Hartmann 2001; Reis 2005) have noted that (9) seems to lack the second one
of these readings: the reading where the postverbal subject takes scope over
the sentence-initial object (the every ≫ only reading in (9)).
(9) Nur Maria liebt jeder.
only Mary[acc] loves everyone.nom
only ≫ every: ∀y (y = Mary ↔ ∀x love(x, y))
[every ≫ only: ∀x ∀y (y = Mary ↔ love(x, y))]
Meyer & Sauerland (2009) explain the lack of evidence for the every ≫ only
reading by arguing that this reading cannot be detected for pragmatic
reasons. Specifically, the Truth Dominance principle in (10) predicts it to be
undetectable: Because the strong, every ≫ only reading entails the weak,
only ≫ every reading, any situation where the truth values of the two readings
differ is one where the strong reading is false while the weak one is true. But
Truth Dominance predicts that in such a situation, speakers will judge the
sentence to be true, as it is predicted to be by the weak reading. The strong
reading therefore remains undetectable in the truth conditions of (9).
(10) Truth Dominance: Whenever an ambiguous sentence S is true in a
situation on its most accessible reading, we must judge sentence S to
be true in that situation. (Meyer & Sauerland 2009: (1))
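The logical relation between the two formulas given under (9) can be checked by brute force. The following is a minimal sketch, assuming a three-individual domain that contains Mary and treating love as an arbitrary binary relation over that domain; the names are illustrative only. It reports which formula entails the other and whether, in every situation where the two differ in truth value, the same one of them is true.

```python
from itertools import product

DOMAIN = ("Mary", "Bill", "Sue")
PAIRS = [(x, y) for x in DOMAIN for y in DOMAIN]

def only_every(love):
    """forall y (y = Mary <-> forall x love(x, y))"""
    return all((y == "Mary") == all((x, y) in love for x in DOMAIN)
               for y in DOMAIN)

def every_only(love):
    """forall x forall y (y = Mary <-> love(x, y))"""
    return all((y == "Mary") == ((x, y) in love)
               for x in DOMAIN for y in DOMAIN)

# Enumerate all 2**9 possible love relations over the domain.
models = [frozenset(p for p, keep in zip(PAIRS, bits) if keep)
          for bits in product([False, True], repeat=len(PAIRS))]

every_only_entails_only_every = all(only_every(m) for m in models if every_only(m))
only_every_entails_every_only = all(every_only(m) for m in models if only_every(m))

# Situations in which the two formulas differ in truth value.
differing = [m for m in models if only_every(m) != every_only(m)]
only_every_true_where_differ = all(only_every(m) for m in differing)

print(every_only_entails_only_every,
      only_every_entails_every_only,
      only_every_true_where_differ)
```

A situation in which the formulas come apart is exactly the kind of situation to which principle (10) applies, so the third value indicates which reading determines the predicted judgment there.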
5 The principle can be traced back at least to work on wide scope indefinites by Abusch (1994).
Gualmini, Hulsey, Hacquard & Fox (2008) call a similar principle Charity. The differences
between Charity and Truth Dominance are not relevant to the discussion in this paper. In
fact, Charity would make exactly the same predictions as Truth Dominance for the examples
in the following.
Principle (10) is a well-supported pragmatic principle: As already mentioned,
work by Abusch (1994) and Gualmini et al. (2008) provides further
support for a principle like (10) and in addition, principle (10) makes prag-
matic sense as a principle of cooperative behavior in discourse.
Principle (10) is directly relevant for determining the predictions of Fox’s
(2007) analysis of implicatures in the following way. As discussed above, Fox’s
account predicts (7) to be ambiguous between the two readings represented
in (8). However, reading (8a) entails (8b), so the same situation obtains as
with the two readings of (9). Principle (10) entails for (7) that only reading
(8b) can be empirically detected for (7).6
Indeed, G&P argue that only reading (8b) is empirically supported by
the judgments of the subjects in their experiments, which is the judgment
that Fox’s ambiguity account together with Truth Dominance predicts. The
experimental results of G&P are therefore fully consistent with Fox’s proposal
and what G&P call Minimal Conventionalism more generally.
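The detectability argument for (7) can be illustrated on small models. The following sketch assumes two squares and two circles and paraphrases the two readings in (8) as follows; these paraphrases are assumptions made for illustration and are not formulas from the text. The global-only reading (8b) is taken to say that every square is connected with at least one circle and that not every square is connected with all circles. The local+global reading (8a) is taken to say that every square is connected with some but not all of the circles.

```python
from itertools import product

SQUARES = ("s1", "s2")
CIRCLES = ("c1", "c2")
PAIRS = [(s, c) for s in SQUARES for c in CIRCLES]

def some(conn, s):
    return any((s, c) in conn for c in CIRCLES)

def all_(conn, s):
    return all((s, c) in conn for c in CIRCLES)

def global_only(conn):
    """Assumed paraphrase of (8b)."""
    return (all(some(conn, s) for s in SQUARES)
            and not all(all_(conn, s) for s in SQUARES))

def local_global(conn):
    """Assumed paraphrase of (8a)."""
    return all(some(conn, s) and not all_(conn, s) for s in SQUARES)

# Enumerate all 2**4 connection patterns between squares and circles.
models = [frozenset(p for p, keep in zip(PAIRS, bits) if keep)
          for bits in product([False, True], repeat=len(PAIRS))]

# (8a) entails (8b): every model of the local+global reading verifies the global-only one.
entailment = all(global_only(m) for m in models if local_global(m))

# Wherever the readings come apart, the weaker global-only reading is the true one,
# so Truth Dominance (10) predicts a "true" judgment in those situations.
differ = [m for m in models if global_only(m) != local_global(m)]
weak_true_where_differ = all(global_only(m) for m in differ)

print(entailment, weak_true_where_differ, len(differ))  # True True 4
```

Under these assumptions there is no situation in which the local+global reading alone would make the sentence true, which is why truth-value judgments cannot reveal it.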
The preceding discussion does not entail that Fox’s (2007) account is
without problems: Fox’s account makes the wrong prediction for cases
like (3) because the local application of implicature computation leads to
a weaker interpretation than the one actually attested. In this case, the
local implicature would be detectable, but is actually not attested. Fox
briefly entertains two suggestions that would address this shortcoming (Fox
2007: page 82), but both fall short. One suggestion is to only compute local
implicatures if they strengthen the sentence meaning. The other suggestion
is to only permit global application of his Exh operator. Both of these
suggestions solve the problem of (3), but leave Fox with no account for (1).
So, Fox’s (2007) account would need to be amended further. For example,
the empirical problems would be solved by stipulating that embedded Exh is
blocked unless an inconsistency or pragmatic violation results otherwise.7
The important point for our present purposes, though, is that G&P’s data
do not bear on Fox’s (2007) account and probably others that G&P would
characterize as Minimal Conventionalism. The above discussion also shows
how difficult it is to address the puzzle posed by embedded implicatures
by just looking at the distribution of embedded implicatures. I conclude
6 To be more precise, the application of Truth Dominance here assumes that (8b) represents a
more accessible syntactic parse than (8a). Since (8b) contains fewer silent operators, this
assumption is independently justified.
7 This is essentially a more specific statement of the view of Sauerland (2004a) that embedded
implicatures are a repair strategy.
therefore that a comparison of the properties of embedded implicatures with
non-embedded ones may be a more promising direction to pursue than to
solely focus on the distribution of embedded implicatures. Such attention to
the properties of implicatures would address both II and III of the questions
in (2). In the next section, I focus on one aspect of the data reported by C
that points in this direction.
2 Are Embedded Implicatures Implicatures?
The results of C (= Chemla 2009b) add one new aspect, but are otherwise
consistent with the picture already summarized: The results show that
obligatory localism is false, but don’t distinguish between other views. In
particular, C’s discussion of examples like (11a) and (11b) is limited in the same
way as the discussion of (7) by G&P: the only theory Chemla’s result argues
against is the extreme localism of Levinson (2000) and Chierchia (2004),
which is already known to have numerous problems. More viable views of
localism, where embedded implicatures are an option, but not required, make
exactly the right predictions for both examples in (11) — namely, the same
predictions as a global account.
(11) a. Every student read some of the books.
b. No student read all the books.
The most interesting result of C’s study is the embedded free choice
effect in examples like (12). He shows experimentally that subjects judge (12)
to entail that every student is allowed to have an apple and also that every
student is allowed to have a banana.
(12) Every student is allowed to have an apple or a banana. (C: (12b))
Chemla’s observation is interesting because it shows a difference between
free choice effects and scalar implicatures, which are not as frequently
locally computed in the same environment. This difference may bear on the
second and third of the questions in (2). Unfortunately, Chemla’s theoretical
discussion is limited to the account of Fox (2007) and on one point even
mistaken. Chemla compares the two versions of Fox’s (2007) proposal I
already mentioned above: either permitting embedded occurrences of Exh
or restricting Exh to one occurrence with clausal scope per utterance. The
non-deterministic former view predicts (12) to be ambiguous between the two
representations in (13), while the latter globalist view permits only
representation (13a).
(13) a. Exh Every student λx x is allowed to have an apple or a banana.
b. Exh Every student λx Exh x is allowed to have an apple or a
banana.
Chemla focuses on the fact that Fox’s globalist view incorrectly predicts
that (12) should be restricted to scenarios where not all students make
the same choice, since (13a) entails that neither every student is allowed to
have an apple nor every student is allowed to have a banana. What Chemla
fails to note, though, is that the optionally local version of Fox’s proposal also
predicts (13a) as a possible reading for (12). In particular, (13a) does not
entail (13b), nor vice versa, and therefore both readings should be detectable.
But this doesn’t seem to be the case, and therefore (12) is also a problem
for the non-deterministic version of Fox’s proposal, not just for the global
one. Since I argued above that both versions are independently problematic,
Chemla’s new evidence just strengthens the point against both proposals.
The main conclusions I draw from Chemla’s paper concern a) the status of
free choice inferences, and b) the relation of embedded to global implicatures.
Chemla’s data only speak to my questions in (2) if we assume that free
choice inferences are indeed implicatures. Chemla’s data actually cast this
relationship in doubt. There is not that much empirical evidence in favor
of the relationship in the first place: the main direct piece of evidence in
favor of an implicature account of free choice inferences is the observation
by Kratzer & Shimoyama (2002) that the inferences disappear in the scope of
negation just like implicatures.8 However, C shows two differences between
free choice inferences and implicatures: First, only free choice inferences are
locally present in the scope of a universal quantifier as I already referenced
above. Second, C shows that negated modalized statements like (14b) don’t
trigger free choice inferences. Since (14b) is logically equivalent to (14a), the
absence of a free choice inference in (14b) shows that free-choice inferences
are not detachable in the sense of Grice (1989). Usually implicatures are
detachable as, for example, Grice already discusses.
(14) a. John is allowed to not do A or not do B.
b. John is not required to do A and B. (Chemla 2009b: (15a))
8 Furthermore, free choice inferences can also be cancelled like other implicatures as in You
may have an apple or a banana, but I don’t know which.
C’s results are intuitively plausible and very interesting for the theory of
free-choice inferences. As far as I can see, there are two possible directions to
pursue. On the one hand, one could seek to treat free choice inferences not as
implicatures. Specifically, C’s result could be seen to support non-implicature
accounts of free choice such as Zimmermann (2000). On the other hand, it
may be that matrix free-choice effects are still implicatures, but embedded
free choice effects may be due to a special free-choice inference generating
operator. The latter position should be attractive to those who believe that
there are satisfying analyses of free-choice inferences as implicatures (Fox
2007; Schulz 2005).
3 Conclusions
In sum, the recent experimental work by G&P and C has confirmed the
views of those who have argued against the obligatory localism of Levinson
(2000) and Chierchia (2004), e.g. Geurts (2009), Russell (2006), and Sauerland
(2004b). Beyond that, the account of embedded implicatures and their
relation to global implicatures are still unclear. Solely testing for the presence
of embedded implicatures as G&P and C mostly do may be insufficient for
understanding embedded implicatures. Rather it may be more promising to
investigate whether the content of embedded implicatures is exactly the same
as that of implicatures at the matrix level.
In this direction, the difference C observes between embedded and matrix
implicatures is interesting and most likely helpful in sorting out the puzzle
of embedded implicatures. While I have no complete account to offer myself,
I close with some reasons to be skeptical of Geurts & Pouscoulous’s (2009b)
analysis of Chemla’s example (12). Geurts & Pouscoulous (2009b) suggest
accounting for (12) as an instance of an embedded speech act. This account, I
argue now, is plausible for some cases, but most likely cannot cover all cases
of embedded implicatures. The possibility of embedded speech acts has
been acknowledged at least since Huddleston (1973) and embedded speech
acts can certainly be a source of embedded implicatures:9 (15) illustrates
that embedded speech acts must trigger embedded implicatures: the modal
particle wohl (‘well’) requires an embedded speech act interpretation for the
complement of glaubt (‘believes’) and furthermore triggers an inference that
9 The idea of a metalinguistic negation of Horn (1985) is closely related to the idea of embedded
implicatures, but more specific since it assumes a restriction to negation.
the speaker also believes the complement clause.
(15) #Bill glaubt, dass einige der Kinder wohl krank sind. Aber alle
Bill believes that some of the children wohl sick are but all
Kinder sind krank.
children are sick.
However, this alone doesn’t predict correctly that (15) is odd. The oddness
of (15) is only predicted if there is also an embedded implicature. The
embedded implicature is the reason that the stronger belief, that some, but
not all children are sick, is attributed to the speaker. Then (15) is predicted
to be odd because the second sentence explicitly contradicts this attribution
of the embedded implicature to the speaker.
This example indicates that embedded speech acts trigger embedded
implicatures as all theories of speech acts would predict.10 However, I do
not believe that the reverse entailment also holds — that an embedded im-
plicature is always triggered by an embedded speech act. One problem for
this entailment is the following: Krifka (2001) argues that most does not
allow embedding of speech acts in its scope. Hence, (16) should not allow
embedded free choice inferences unlike (12). However, this doesn’t accord
with my intuitions: (16) suggests that most students can choose freely. For
instance, consider (16) in the following scenario: the majority of students
can freely choose between A and B and the few other students, who cannot
freely choose, must do option A. In this scenario (16) seems acceptable to
me, even though it may happen that not a single student chooses option B.
The acceptability of (16) in such a scenario is only expected if the free choice
inference is embedded in the scope of most.
(16) Most students are allowed to do A or B.
The embedded free choice inferences in (16) couldn’t be due to an embedded
speech act if Krifka (2001) is correct that most blocks embedded
speech acts. Therefore, (16) presents a problem for the proposal to derive
all embedded implicatures from embedding of speech acts. Some further
data that are problematic for the idea of deriving all embedded implicatures
from embedded speech acts are discussed by Sauerland (2004a). Therefore, I
conclude that contrary to Geurts & Pouscoulous’s (2009b) opinion, Chemla’s
10 This prediction, of course, arises to the extent that theories of speech acts permit embedding
of speech acts in the first place.
data in (12) are still in need of an account. And the search for such an
account may finally really lead us to a better understanding of embedded
implicatures.
References
Abusch, Dorit. 1994. The scope of indefinites. Natural Language Semantics
2(2). 83–135. doi:10.1007/BF01250400.
Bezuidenhout, Anne, Robin Morris & Cintia Widman. 2009. The DE-blocking
hypothesis: The role of grammar in scalar reasoning. In Sauerland &
Yatsushiro (2009), 124–144.
Büring, Daniel & Katharina Hartmann. 2001. The syntax and semantics of
focus-sensitive particles in German. Natural Language & Linguistic Theory
19(2). 229–281. doi:10.1023/A:1010653115493.
Chemla, Emmanuel. 2009a. An experimental approach to adverbial modifica-
tion. In Sauerland & Yatsushiro (2009), 249–263.
Chemla, Emmanuel. 2009b. Universal implicatures and free choice effects: Ex-
perimental data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and
the syntax/pragmatics interface. In Adriana Belletti (ed.), Structures and
beyond, 39–103. Oxford, UK: Oxford University Press.
Chierchia, Gennaro, Benjamin Spector & Danny Fox. 2008. The grammatical
view of scalar implicatures and the relationship between semantics and
pragmatics. Ms, to appear in Maienborn, Claudia, Klaus von Heusinger,
and Paul Portner (eds.) Handbook of Semantics.
Fox, Danny. 2007. Too many alternatives: Density, symmetry and other
predicaments. Semantics and Linguistic Theory 17. doi:1813/11295.
Geurts, Bart. 2009. Scalar implicature and local pragmatics. Mind and
Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009a. Embedded implicatures?!?
Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: a re-
sponse to Emmanuel Chemla. Semantics and Pragmatics 2(5). 1–10.
doi:10.3765/sp.2.5.
Grice, Paul. 1989. Studies in the way of words. Cambridge, MA: Harvard
University Press.
Gualmini, Andrea, Sarah Hulsey, Valentine Hacquard & Danny Fox. 2008. The
question-answer requirement for scope assignment. Natural Language
Semantics 16(3). 205–237. doi:10.1007/s11050-008-9029-z.
Horn, Laurence R. 1985. Metalinguistic negation and pragmatic ambiguity.
Language 61(1). 121–174.
Huddleston, Rodney. 1973. Embedded performatives. Linguistic Inquiry 4(4).
539–541.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns: The
view from Japanese. In Yukio Otsu (ed.), Proceedings of the Third Tokyo
Conference on Psycholinguistics, 1–25. Tokyo: Hituzi Syobo.
Krifka, Manfred. 2001. Quantifying into question acts. Natural Language
Semantics 9(1). 1–40. doi:10.1023/A:1017903702063.
Levinson, Stephen C. 2000. Presumptive meanings. Cambridge, Mass.: MIT
Press.
Meyer, Marie-Christine & Uli Sauerland. 2009. A pragmatic constraint on
ambiguity detection: A rejoinder to Büring and Hartmann and to Reis.
Natural Language & Linguistic Theory 27(1). 139–150. doi:10.1007/s11049-
008-9060-2.
Reis, Marga. 2005. On the syntax of so-called focus particles in German:
a reply to Büring and Hartmann 2001. Natural Language & Linguistic
Theory 23(2). 459–483. doi:10.1007/s11049-004-0766-5.
Russell, Ben. 2006. Against grammatical computation of scalar implicatures.
Journal of Semantics 23(4). 361–382. doi:10.1093/jos/ffl008.
Sauerland, Uli. 2004a. On embedded implicatures. Journal of Cognitive
Science 5(1). 107–137.
Sauerland, Uli. 2004b. Scalar implicatures in complex sentences. Linguistics
and Philosophy 27(3). 367–391. doi:10.1023/B:LING.0000023378.71748.db.
Sauerland, Uli. submitted. Disjunction and implicatures: Some notes on recent
developments. In Chungmin Lee (ed.), Proceedings of the CIL18 workshop
on contrastiveness in information structure and/or scalar implicatures.
Sauerland, Uli & Kazuko Yatsushiro (eds.). 2009. Semantics and pragmatics:
From experiment to theory. Basingstoke, UK: Palgrave Macmillan.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice
permission. Synthese 147(2). 343–377. doi:10.1007/s11229-005-1353-y.
Schwarz, Florian, Charles Jr. Clifton & Lyn Frazier. 2008. Strengthening ‘or’:
Effects of focus and downward entailing contexts on scalar implicatures.
To appear in UMOP 37.
Singh, Raj. 2008. On the interpretation of disjunction: Asymmetric, incre-
mental, and eager for inconsistency. Linguistics and Philosophy 31(2).
245–260. doi:10.1007/s10988-008-9038-x.
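Zimmermann, Thomas Ede. 2000. Free choice disjunction and epistemic possibility.
Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.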
Uli Sauerland
Zentrum für Allgemeine Sprachwissenschaft
Schützenstr. 18
D-10117 Berlin
uli@alum.mit.edu
doi: 10.3765/sp.3.8
Varieties of conventional implicature∗
Eric McCready
Aoyama Gakuin University

2010-05-08 / Final Version Received 2010-05-24 / Published 2010-07-29
Abstract
This paper provides a system capable of analyzing the combinatorics of a
wide range of conventionally implicated and expressive constructions in nat-
ural language via an extension of Potts’s (2005) LCI logic for supplementary
conventional implicatures. In particular, the system is capable of analyzing
objects of mixed conventionally implicated/expressive and at-issue type,
and objects with conventionally implicated or expressive meanings which
provide the main content of their utterances. The logic is applied to a range
of constructions and lexical items in several languages.
Keywords: conventional implicature, mixed content, type logic, resource sensitivity,
expressive content
1 Introduction
The nature of conventional implicatures has been under debate since their
existence was proposed by Grice (1975). Some philosophers deny that there
are such things at all (Bach 1999). In linguistic semantics, however, there
has been a recent surge of interest in their analysis, starting with the work
of Potts (2005). The work of Potts in this area has centered on conventional
implicatures that provide content which supplements the main, at-issue
content of the sentence in which they are used.
∗ Thanks to Daniel Gutzmann, Yurie Hara, Makoto Kanazawa, Stefan Kaufmann, Chris Potts,
Magdalena Schwager, Yasutada Sudo, Wataru Uegaki, Ede Zimmermann, and audiences at
NII, Kyoto University and the University of Göttingen for helpful discussion, and in particular
to three anonymous reviewers for Semantics and Pragmatics, as well as David Beaver and
Kai von Fintel, for extremely useful and insightful comments.
©2010 Eric McCready

(1) a. John, a banker I know, played golf with Bernie yesterday.
    b. Frankly speaking, I don’t know what you’re talking about.
Here, the content of the nominal appositive in (1a) and that of the speaker-
oriented adverbial in (1b) add content to the utterance, but in a way intuitively
independent of the claim the speaker intends to make by her utterance. No-
tice also that the appositive and adverbial only introduce conventionally
implicated content; they add nothing to the ‘at-issue’ content. This is charac-
teristic of all the elements studied by Potts.1
A number of authors (e.g. Bach 2006; Williamson 2009) have noted that
not all lexical items (or constructions) are associated exclusively with at-
issue content, or with conventionally implicated or expressive (CIE) content;2
instead, some expressions seem to introduce both. Pejoratives are the most
widely cited example. Williamson discusses an example from Dummett (1973),
the (extinct) pejorative Boche, which according to Williamson was in use in
Britain and France in the initial stages of WW1 in anti-German propaganda.
This choice is presumably made to avoid other expressions that are more
obviously offensive to the modern reader. However, the obsolete nature of
Boche makes it difficult to have clear intuitions about sentences in which it is
used. I will therefore make use of the pejorative Kraut instead, as an example
of a pejorative that, while still attested, is probably milder and less offensive
than some other possible choices.3 In any case, all instances of pejoratives in
this paper are data; they are mentioned, not used.
(2) He is a Kraut.
Pejoratives plainly introduce what I will call mixed content: they are pred-
icative of at-issue content, yet introduce a conventional implicature. I will
1 It is still possible that these expressions could be presuppositional in nature, rather than
part of a separate class of conventionally implicated meanings, as suggested by a reviewer.
I find the arguments of Potts on this issue (2005, 2007a, and 2007b) convincing, but I will
return to the point below.
2 ‘CIE content’ is intended as a neutral term for conventionally implicated and expressive con-
tent. In this paper, the assumption is made that, to a first approximation, both conventional
implicatures and expressives make use of roughly the same combinatoric system. Where the
distinction matters, I will not use the cover term.
3 I thank David Beaver and Kai von Fintel for helping me with the difficult choice of which
pejorative would have the desired qualities of being both relatively common and relatively
inoffensive. Hom (2008), faced with a similar decision, makes use of Chink, which is perhaps
fairly similar in quality.
provide a more detailed characterization of mixed content expressions in the
next section. Potts’s core logic is not able to handle examples of this sort
of mixed content (due to limitations imposed by the type system) without
additional, costly assumptions about semantic decomposition.4 The first
goal of this paper is to provide a system capable of analyzing mixed content
without such assumptions; formally this corresponds to an extension L+CI
of Potts’s original (2005) system LCI, which is the most explicit theory of CIE
content presently available, and the one that is best understood. This is done
in section 2, where I also discuss and analyze other cases of mixed content.
The research of Potts and others suggests that conventionally implicated
content is supplementary by nature, a conclusion embodied in the original
system LCI, as will be shown in section 2.3. The cases of mixed content to
be discussed do not significantly alter this picture: although mixed content
elements introduce content in both at-issue and CIE dimensions, there is
clearly a sense in which the CIE content remains supplementary to the at-issue
content. LCI does allow for non-supplementary content of propositional type,
which is one way to view purely expressive single-expression utterances, as I
will show in section 3. In cases where combinatorics come into play, however,
only supplementary interpretations are available. The extended system L+CI
enables non-supplementary interpretations for reasons explained in detail in
sections 2.3 and 3.1: briefly, a new set of types turns out to be necessary for
the mixed content cases, given the combinatoric rules of LCI (which I will
argue should be kept intact). These cases cannot be analyzed in LCI at all.
Section 3, after discussing some instances of single-expression utterances,
argues that there are reasons that one might want the additional possibilities
given by these new types. Two main reasons are discussed: first, cases of
elements that are able to modify certain kinds of CIE elements but not others
(a possibility already disallowed by LCI) and, second, cases of multiexpression
utterances that seem to lack at-issue content. The main cases focused on
are stand-alone particles and the Japanese adverbial yokumo (cf. McCready
2004). Section 4 presents an analysis of Quechua evidentials (following the
basic picture presented by Faller 2002) that treats parts of their content as
CIE; this analysis makes use of the full system introduced here.
The analyses proposed here, if correct, have substantial implications
for our understanding of CIE elements, and possibly for other semantic
elements as well. Section 5, the conclusion to the paper, discusses some of
4 Williamson comes to the same conclusion about pejorative items, indeed noting that Potts
must allow for mixed content to analyze them (his note 16).
these implications, as well as summing up the paper and mentioning some
directions for future research.
2 Mixed Content
This section focuses on mixed content. I begin by providing criteria for
an expression introducing mixed content, in section 2.1, continuing with
a detailed look at the case of pejoratives in section 2.2. There it becomes
clear that there are two parts to the meaning of a pejorative expression:
an ‘ordinary’ predication of an individual as part of some group, and a
negative attitude expressed by the speaker with regard to that individual
by virtue of being part of that group. As has been argued in the literature,
this first content must be at-issue, while the second must be CIE. I review
these arguments and add some additional ones. Section 2.3 introduces
Potts’s (2005) logic LCI and shows that it has no way of producing single
lexical entries for linguistic objects that introduce mixed content. As I will
show, however, this does not mean that LCI has no way of analyzing such
expressions: it can decompose them into multiple morphemes at some level
of representation, some introducing at-issue content, and some CIE content.
I will evaluate this way of doing things in section 2.3 as well, concluding
that it is undesirable as a general method. Section 2.4 extends the logic to a system
that can analyze mixed content without decomposition: this is done by
allowing the construction of additional types via the recursive type definition,
and (crucially) introducing new combinatoric operations over these types.
The resulting system, L+CI, is used to analyze pejoratives in section 2.5. Section
2.6 examines other kinds of mixed content elements: formal and informal
pronouns, benefactive expressions, and certain honorifics, among others.
2.1 Mixed Content: Criteria
Before considering particular examples of expressions introducing what I will
be calling mixed content, it will be useful to first make it clear exactly what is
meant by this term.5 I will take an expression to introduce mixed content if
it fulfills the following two criteria.
First, it should introduce content in both at-issue and CIE dimensions.
The pejorative case above fills the bill: it is predicative, and so introduces
content in the at-issue dimension, but at the same time introduces an attitude
5 Thanks to several reviewers for suggesting that this exposition be made.
of the speaker toward some individual or group of individuals, which is CIE
content, as I will show in detail in the next section. Introducing content in
both dimensions is the essential criterion.
The second criterion is that it should be monomorphemic. Exactly what
counts as monomorphemic is, in part, a theory-dependent notion; the amount
of decomposition licensed by a particular theory will influence what can count
as introducing mixed content. For example, if one were to take pejoratives
like Kraut to introduce multiple morphemes at some level of semantic com-
position, then such pejoratives would no longer introduce mixed content, at
that level; rather, each bit of the word meaning would introduce unmixed
content of either purely at-issue or purely CIE type. This criterion means
that the first criterion is in fact strengthened: not only must at-issue and CIE
content both be introduced, but they must be introduced simultaneously, at
the same point in semantic composition.
A word about my own methodology. Here I will be working mostly
with a naive view of word structures which admits little to no semantic
decomposition. This will lead to taking certain expressions to introduce
mixed content which on other approaches might not do so; it will also
lead to a particular analysis which allows such introductions. I will discuss
some issues raised by this view, as well as alternate possible accounts, after
introducing the analysis itself. In any case, I believe that I will be able to
present some examples of mixed content that are monomorphemic on most
anyone’s view of the lexicon.
2.2 Pejoratives
Let us take as our main example of pejoratives Kraut, mentioned already
in the previous section. This choice is made for reasons of delicacy: many
current pejoratives sting quite a bit more than this one does, thanks to the
fact that it is (I believe) not used very much these days: in this sense it
resembles Boche, though it is much more current. This allows for a more
objective consideration. If the reader wishes to sharpen intuitions, she is
welcome to substitute her favorite pejorative; also, if she finds the particular
pejorative I have chosen excessively offensive, she is welcome to substitute
another one.6
Kraut is a pejorative term for German people on its nominal use. By
6 Again, just to make things absolutely clear, I have no attachment to the word Kraut, and I
would not want to be associated with the attitude it expresses.
saying (2), repeated as (3), I assert that the referent of he is German, and
express that I have negative feelings about him.
(3) He is a Kraut.
Here, ‘Kraut’ obviously must contribute to at-issue content: if it does not, the
sentence cannot form a proposition, for the pejorative is the main predicate
of the sentence. The same can be seen when pejoratives serve as subjects.
(4) Every Kraut is not evil.
Here, the pejorative term is serving as the first argument to the determiner
(on a standard semantics). Pejoratives thus clearly form part of at-issue
content.
The expression of negative feeling that the word introduces, though, is not
part of at-issue content. This can be seen by considering the characteristics
of conventionally implicated and expressive content as discussed in Potts
2005 and Potts 2007a. Potts lists a number of properties that these kinds
of content are meant to have, some of which have been called into question
by various authors (e.g. Wang, Reese & McCready 2005; Wang, McCready &
Reese 2006; Geurts 2007; Amaral, Roberts & Smith 2008). In this paper I
will primarily consider two tests for conventional implicature/expressiveness
(CIEness). The first is scopelessness. The second is the behavior of CIE items
under denial.
CIE items, by definition, do not participate in at-issue semantic processes.7
In particular, they are not affected by semantic operators. Consider the
following examples.
(5) a. It is false that John, the swimmer, is a good dancer.
    b. If John, the swimmer, comes to the party, everyone will have a
       good time.
(6) a. That damn John didn’t come to the party.
    b. If that damn John comes to the party, no one will have a good
       time.
7 I do not consider here counterexamples to this claim which have been raised by Wang et al.
2005, 2006 and Amaral et al. 2008. These authors’ focus is on indefinite appositives in the
first case and on the interaction of attitude verbs and CIE content in the second. In my
discussion, I will use only examples that have not been controversial. I think it is clear that
the Potts generalizations about scope independence and denial hold for at least the areas of
CIE content and operators that I will be concerned with.
In these examples, it is clear that the content of the nominal appositives is not
affected by the negation or by the conditional, and similarly for the expressive
adjective damn. In this respect, CIE content is similar to presupposition.
It differs in that it cannot be bound (cf. van der Sandt 1992). ‘Binding’
refers to the situation in which a conditional antecedent (or other universal
construction) entails the content of a presupposition which appears in the
consequent. In this situation, no presupposition is projected.
(7) If John has a daughter, John’s daughter must be pretty.
Such binding does not happen for CIE content.
(8) a. # If John is a swimmer, then John, a swimmer, came to the party.
    b. # If I hate John, then that damn John came to the party.
In these sentences, the content of the appositive, that John is a swimmer,
and of the expressive adjective, that John is in some way bad, is indeed
projected. The infelicity of the examples can be taken to follow from this
projection behavior: in (8a), for instance, since the speaker indicates that
John is a swimmer, it is odd conversational behavior to conditionalize over
this content, producing a sense of redundancy.
The second test relates to the first. CIE content does not participate in
denials. In ordinary denial, the truth of any at-issue part of a sentence can be
called into question. B’s denial in (9a) has the interpretations in (9b).8
(9) a. A: John came to the party last night.
       B: That’s not true/That’s false.
b. ‘John didn’t come to the party.’
‘John didn’t come to the party last night.’
etc.
Consider what happens when one denies a sentence containing CIE content.
As the following examples show, the CIE content cannot be the target of
denial.
(10) a. A: John, a swimmer, came to the party last night.
     b. ≠ ‘John is not a swimmer.’
8 Exactly which interpretation is selected will depend on focus, discourse topic, and other
aspects of information structure.
(11) a. A: That damn John came to the party last night.
     b. ≠ ‘There’s nothing wrong with John.’
Insofar as we take denial to be at least partly a semantic operation (cf. van
Leusen 2004), the result of this second test is a direct corollary of the first.
Now we can apply our first test to the cases of present concern: what
happens when one attempts to embed pejoratives? Clearly, the negative
attitudes they express are projected in that situation, so the content must at
the very least be presuppositional.
(12) a. He is not a Kraut.
     b. He might be a Kraut.
     c. Is he a Kraut?
However, if it is presuppositional we would expect that it can be ‘bound’ in
the usual way, so that if a conditional antecedent entails the non-assertive
content of Kraut this content will not be projected. In order to check whether
this is possible, we must determine what exactly the content of Kraut is.
Discussing Boche, Williamson takes the expressed content to be that the
individual picked out by the subject he is cruel, noting that it is not clear
that this really captures the non-asserted part of the meaning. Here he is
abstracting from Dummett, who writes ‘barbarous and more prone to cruelty
than other Europeans’ (Dummett 1973:454). I do not think Williamson’s
paraphrase is correct (indeed, he himself is not satisfied with it). It is certainly
not correct for the modern pejoratives that I know; while it may be correct
for Boche, it seems that pejoratives behave more or less alike in terms of
their basic meanings, differing only in the degree of opprobrium assigned to
the individual or group under discussion.9
Richard (2008) describes the expressive part of the content of pejoratives
as that an individual is bad by virtue of membership in a particular group;
in this case, the individual picked out by the pronoun is bad by virtue of
being a German. This is closer, but still cannot be correct; note that in the
examples in (12) there is no implication that the subject individual is bad in
9 I say ‘individual or group’ so as not to prejudge the issue. Also, it is possible that there may
be real differences between pejoratives in semantic terms, and that there may be different
semantic classes of pejoratives. These issues are larger than I can take on in the present
paper.
any way at all.10 Instead, what is expressed by the sentences in (12) is that
the speaker takes German people to be bad.11 Presumably the sense that the
subject individual is negatively characterized that Williamson picks up on is
derived via an inference: since it is asserted that he is German, and expressed
that German people are bad, it is also expressed, though indirectly, that he is
bad. But this does not seem to be a part of literal content, either at-issue or
CIE.
Supposing then that the expressed content of Kraut is roughly that
German people are bad, we can test its bindability via a conditional in the
usual way.
(13) If (I think) Germans are bad, then he is a Kraut.
This sentence is rather odd, in part because the expressed content of Kraut
does indeed appear to project from the conditional.12 On the assumption that
the proposed paraphrase is the right one, and generalizing from this case,
we can conclude that the expressed content of pejoratives is CIE rather than
presupposed. I will assume so in the following. It should be noted, however,
that the significance of the result of the binding test depends on the accuracy
of the paraphrase. If the paraphrase given is incorrect, or, even worse, if the
expressive portion of pejoratives is such that it does not admit a linguistic
paraphrase at all, then the test is invalidated. This is worrisome given the
analysis of Potts (2007a), according to which expressives have the property
of ‘ineffability,’ meaning that they literally cannot be paraphrased in ways
not involving other expressives.13 Even in this case, though, an expressive
paraphrase is possible:14
10 Unless one takes it to be a bad thing that one is not, or might be (etc.), a German; I will ignore
this notion in the following.
11 This may well be what Dummett had in mind.
12 A reviewer suggests that the oddity is due to the speaker apparently expressing uncertainty
about his own attitudes, which should be pragmatically inappropriate. However, even if the
speaker is an amnesiac who in fact does not know what his attitudes are (in some sense), the
oddity remains, suggesting that this is not the right explanation.
13 Geurts (2007) notes that something similar holds for other, non-expressive words like
green, though: they are not easily given satisfying paraphrases either. See also Fodor 2002.
However, the degree of difficulty seems to be different for the cases of green and (e.g.) damn.
A paraphrase of the latter cannot even be attempted without using expressives, whereas one
can (for instance) try to give exemplars of greenness for the former. I think Potts is right in
distinguishing the two types. I will have more to say about this issue in the conclusion.
14 A reviewer notes that the projection behavior may not be very surprising, given that we also
have expressive content in the antecedent, which has nothing to bind it. The fact that it
(14) If I hate the {damn|fucking} Germans, then he is a Kraut.
Here, if one accepts the Richard analysis, the expressive content of ‘Kraut’ is
pretty clearly entailed15 by the content of ‘(I) hate the damn/fucking Germans.’
The conclusion is that this part of the content of Kraut is not presupposed,
which indicates that it is highly likely to be CIE content, given its other
behavior.16
Let us now consider the second test. What happens when one tries to
deny the content of a pejorative?
(15) a. A: Juan is a Kraut.
     b. ≠ ‘German people are not bad.’
The result of this test also supports the conclusion that the negative part of
the meaning of Kraut, and, by extension, pejoratives in general is CIE content,
and not part of the at-issue meaning.
To sum up, we have reached the conclusion that pejoratives play a dual
semantic role: they act as ordinary nominals for predication or as arguments
of determiners, etc., but carry CIE content as well. They also appear to be
monomorphemic, at least in many cases. One might argue (as has Chris Potts,
p.c.) that in fact pejoratives are polymorphemic. An argument for such a
view comes from pejoratives like Jap, which could be viewed as composed of
is necessary to use expressives to paraphrase other expressives (given Potts’s ineffability
condition) may be one reason that binding of CIE content is impossible.
15 Or some expressive equivalent for the Potts 2007 system. Since according to that analysis
the function of (emotive) expressives is to narrow down a subinterval of R used as a model
of a range of emotion displayed with respect to some object, one can define a notion of
emotive entailment according to which Px emotively entails Qx iff the interval assigned to
x by P is a subset of that assigned to x by Q. Since I will not make use of this system in this
paper, I will not work out the details.
16 A reviewer suggests an analysis in terms of indexical presuppositions (Schlenker 2007), with
the following lexical entry:
(i) $[\![\textit{Kraut}]\!]^{c,w} = \lambda x :$ speaker($c$) has a negative attitude toward German people in $w(c)$. German($x$)
But this suggests (as far as I can see) that such a presupposition should be bindable in
examples like (ii).
(ii) If I have a negative attitude toward German people, then he is a Kraut.
Again, in amnesia contexts, this should be felicitous; and here the content certainly projects.
a root word (Japanese) and a truncating suffix with an expressive meaning. I
think this is at least reasonably plausible for cases like Jap, but certainly not
for all pejoratives. Expressions like Frog, Yankee17 or the Japanese sangokujin
‘third country person’ — or indeed Kraut — pretty clearly lack a truncation
of the relevant type. At the very least, it is not clear that all pejoratives
contain multiple morphemes. Since the proposed conditions are met, then,
they introduce mixed content.18 In the next section, I will introduce the
compositional system of Potts 2005, which was designed for the analysis of
conventional implicature; as we will see, it is not, as it stands, able to analyze
mixed content qua mixed content. But this is not the end of the story yet.
2.3 LCI
Potts (2005) proposes a pair of logics called LCI and LU for the analysis of
conventional implicature.19 These two logics interact in sometimes complex
ways. The parts of the system that concern us here involve a) what kinds
of expressions are semantically well-formed, b) how these expressions are
combined in the logical syntax, and c) how the resulting expressions are
interpreted. These issues all relate to LCI, which is a higher-order lambda
calculus. The first corresponds to a definition of admissible types in LCI
and the second to rules for how the admissible types are combined. The
third issue corresponds to a rule for the interpretation of conventionally
implicated expressions: effectively a mapping between expressions of LCI,
the type theory used for the combinatorics, to logical forms intended for
model-theoretic evaluation. I examine each in turn. As we will see, the system
as set up in Potts’s work cannot be used to model the behavior of mixed
content expressions, which will prompt modifications to it in section 2.4.
First, the types themselves. Potts defines a system of types. Here, as in
17 As in ‘Yankee Go Home’ — I make no claims about the historical development of the term.
18 It is still debatable whether the precise content I have proposed (following Richard) is right.
Hom (2008) gives an interesting analysis in which pejorative content is not expressive at
all, but instead is a social construct varying across speaker groups. I will not argue in
detail against this proposal here — I am sympathetic to the notion of social construction
of meaning, at least in these sorts of cases — but I doubt that all the content of pejoratives
is truth-conditional. Hom considers and rejects the sort of evidence (denials and operator
scope arguments) I have made use of here. In my opinion he is too hasty in doing so, but
fully responding to his arguments would take us too far afield.
19 I will not review the full motivations for these logics here, or all the details of how they work.
I will focus only on the parts that will be necessary for the proposal in this paper.
the type theories standardly used in linguistic semantics (cf. Heim & Kratzer
1998), basic types are e, t, s, which are used to produce an infinite set of
types via the usual kind of recursive definition. (The details of the definition
are provided in Appendix A.) However, Potts’s logic differs in that it makes
crucial use of a distinction between at-issue types and CI types (‘CI’ indicating
conventional implicature). The distinction is indicated via a superscript ‘a’
or ‘c’ on the type name. At-issue types are freely produced in the usual way.
CI types are distinct: they are always of the form $\langle \sigma^a, \tau^c \rangle$, functions taking
at-issue typed objects as input and outputting CI-typed objects. There is no
mechanism for producing types that take CI-typed objects as input. This,
according to Potts, is the reason that conventionally implicated content is
independent of at-issue operators: there simply are no operators over CI
content.
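
The type discipline just described can be made concrete with a minimal sketch in Python. It is an illustration under stated assumptions, not Potts's own formulation (for which see Appendix A): the class names `Basic` and `Fn` and the helpers `dim` and `well_formed` are hypothetical, and basic CI types such as $t^c$ are assumed to be available alongside their at-issue counterparts.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    name: str  # "e", "t" or "s"
    dim: str   # "a" (at-issue) or "c" (CI)


@dataclass(frozen=True)
class Fn:
    arg: "Type"  # input type
    res: "Type"  # output type


Type = Union[Basic, Fn]


def dim(ty):
    """Dimension of a type; a functional type inherits it from its result."""
    return ty.dim if isinstance(ty, Basic) else dim(ty.res)


def well_formed(ty):
    """True iff ty respects the restriction sketched in the text."""
    if isinstance(ty, Basic):
        return ty.name in {"e", "t", "s"} and ty.dim in {"a", "c"}
    # The input of a functional type must be at-issue:
    # nothing is allowed to take CI-typed objects as input.
    return well_formed(ty.arg) and dim(ty.arg) == "a" and well_formed(ty.res)


e_a, t_a, t_c = Basic("e", "a"), Basic("t", "a"), Basic("t", "c")

predicate = Fn(e_a, t_a)           # ordinary at-issue predicate
appositive = Fn(e_a, t_c)          # CI type: at-issue input, CI output
ci_operator = Fn(t_c, t_a)         # would operate on CI content
ci_modifier = Fn(appositive, t_c)  # would modify a CI-typed object
```

On this sketch `predicate` and `appositive` pass `well_formed`, while `ci_operator` and `ci_modifier` are rejected, which is one way to picture why at-issue operators cannot reach CI content.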
How are these objects combined? LCI has the derivation rules for type
combination shown in Figure 1. Potts couches them as ‘tree admissibility
conditions’ but this comes out to more or less the same thing as a derivation
rule if one understands his trees as proof trees: the Figure 1 notation is more
compact, so I will use it in what follows. As far as I am concerned this is a
notational variant. It should, however, be noted that the logic behaves in ways
that are odd from the standpoint of many logics familiar to linguistics such
as categorial grammar; notably, unlike the categorial grammars implemented
for standard at-issue semantic combination, it is not resource sensitive for
CI types, as detailed below. The essential point is that a resource sensitive
logic is one that consumes resources as they are used in proofs. This is a
property of the combinatorics of at-issue content: combining sleeps with
John yields sleeps(john), but the meanings of noun and verb are consumed
and no longer available for further composition. As we will see, this is a
property that LCI rightly lacks.
The rules in Figure 1 are meant to model the combinatorics in conjunction
with a syntactic structure, just as in the work of Potts, meaning that they
should retain the constituency-driven character of the original LCI rules.20
(R1) is just a reflexivity axiom. (R2) is ordinary application for at-issue
20 I also diverge from Potts on my treatment of CI propositions introduced low in a tree.
In Potts’s formulation, the possible presence of such additional CI conditions warrants
sometimes thinking of these rules as shorthand for a larger rule set. See Potts 2005: 222 for
details. Instead of this route I will consistently make use of (R5) to eliminate all elements of
type $t^c$ from derivations immediately after they are derived, which means that there will not
be extra free-floating CI content. Thanks to a reviewer for inspiring this strategy.
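\[
\begin{array}{ll}
\text{(R1)} & \dfrac{\alpha : \sigma}{\alpha : \sigma} \\[2ex]
\text{(R2)} & \dfrac{\alpha : \langle \sigma^a, \tau^a \rangle, \quad \beta : \sigma^a}{\alpha(\beta) : \tau^a} \\[2ex]
\text{(R3)} & \dfrac{\alpha : \langle \sigma^a, \tau^a \rangle, \quad \beta : \langle \sigma^a, \tau^a \rangle}{\lambda X.\, \alpha(X) \wedge \beta(X) : \langle \sigma^a, \tau^a \rangle} \\[2ex]
\text{(R4)} & \dfrac{\alpha : \langle \sigma^a, \tau^c \rangle, \quad \beta : \sigma^a}{\beta : \sigma^a \bullet \alpha(\beta) : \tau^c} \\[2ex]
\text{(R5)} & \dfrac{\beta : \tau^a \bullet \alpha : t^c}{\beta : \tau^a} \\[2ex]
\text{(R6)} & \dfrac{\alpha : \sigma}{\beta(\alpha) : \tau} \quad \text{(where $\beta$ is a designated feature term)}
\end{array}
\]
Figure 1 Rules of proof in LCI.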
elements; this is completely standard in formal semantics. (R3) is a rule for
intersection, where we abstract over the input type of two elements. (R4) and
(R5) are the rules mainly of interest to us. Given an expression of a given at-
issue type and another expression mapping that type to some conventionally
implicated type, use of (R4) yields the resulting conventional implicature
paired with the original at-issue type, where the ‘•’ operator (henceforth
referred to as ‘bullet’) simply indicates this pairing. The bullet is used only to
conjoin at-issue and CI type objects. This means that any given node in the
proof tree can be decorated with both at-issue and conventionally implicated
content.21 (R5) strips CI objects of propositional type away from a premise
set (by shunting them away to another meaning dimension, as we will see
shortly). What is absolutely crucial in rule (R4) is that the at-issue content is
duplicated in the output of the derivation. This means that the logic allows,
indeed requires, duplication of resources, when conventional implicatures are
involved. Given that LCI is designed for the interpretation of supplementary
elements like appositives and (some) speaker-oriented adverbials, this makes
perfect sense. This observation, though, highlights a difference with standard
categorial logics: since such logics are meant exclusively to model at-issue
semantic composition (via the Curry-Howard isomorphism, cf. Carpenter
1998; Sørensen & Urzyczyn 2006), they are always resource-sensitive. This
difference can be taken as a significant generalization about supplementary
CI(E)s. The final rule, (R6), allows introduction of content via ‘designated
features’; such features can be associated with constructions, as in the case
of appositives, or (in principle at least) with lexical items.
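
The contrast between resource-sensitive at-issue application (R2) and the resource-duplicating CI rules (R4) and (R5) can be illustrated with a small Python sketch. This is a simplification under stated assumptions rather than an implementation of LCI: terms are plain (expression, type) pairs, the function names are hypothetical, and the CI-typed entry for the appositive is simply stipulated, whereas in Potts's system it would be introduced through a designated feature via (R6).

```python
# Types: a basic type is a string such as "e^a" or "t^c";
# a functional type <sigma, tau> is the tuple (sigma, tau).
# A term is a pair (expression, type).

def is_at_issue(ty):
    """A type is at-issue if its final result carries the superscript a."""
    while isinstance(ty, tuple):
        ty = ty[1]
    return ty.endswith("^a")


def apply_at_issue(fn, arg):
    """(R2): ordinary application; only the result is passed on."""
    (f, (sigma, tau)), (b, b_ty) = fn, arg
    if b_ty != sigma or not is_at_issue(sigma) or not is_at_issue(tau):
        raise TypeError("R2 does not apply")
    return (f"{f}({b})", tau)


def apply_ci(fn, arg):
    """(R4): CI application; the at-issue argument reappears unchanged in
    the output, paired (the bullet) with the derived CI term."""
    (f, (sigma, tau)), (b, b_ty) = fn, arg
    if b_ty != sigma or not is_at_issue(sigma) or is_at_issue(tau):
        raise TypeError("R4 does not apply")
    return (arg, (f"{f}({b})", tau))


def strip_ci(pair):
    """(R5): remove a propositional CI term from a bullet pair."""
    at_issue, (_, ci_ty) = pair
    if ci_ty != "t^c":
        raise TypeError("R5 does not apply")
    return at_issue


# Simplified version of (1a): "John, a banker, played golf."
john = ("john", "e^a")
banker = ("banker", ("e^a", "t^c"))  # stipulated CI-typed appositive
played_golf = ("played_golf", ("e^a", "t^a"))

pair = apply_ci(banker, john)        # john is duplicated into the output
subject = strip_ci(pair)             # so john is still available here
sentence = apply_at_issue(played_golf, subject)
```

The point of the sketch is that `john` survives the application of `banker` and can still serve as the argument of `played_golf`, which would not be possible if (R4) consumed its at-issue premise.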
After the semantic computation is complete, the proof tree itself is then
interpreted as a semantic object via the following rule.22
21 •-terms have some affinities to the dot objects of Pustejovsky (1995), and not only in form. I
will say a bit more about this in footnote 29.
22 As noted by Chris Potts (p.c.), this rule is potentially odd from the perspective of proof
interpretation. In proofs, objects of type t are often introduced in the course of logical
derivations but left out of truth evaluation (e.g. in the context of a conditional proof); (16)
has such objects contributing to evaluation just in case they are of type $t^c$. This is just to
say that it is necessary to collect type $t^c$ objects from the entire proof, so in a sense the
proof becomes a first class citizen of the interpretation mechanism and not merely a means
for deriving a sentential interpretation. This may well be out of line with what is commonly
assumed in e.g. the literature on direct interpretation (see Barker & Jacobson 2007). For this
reason Potts uses derivation trees, which he takes to be a necessary intermediate step in
interpretation of CI elements, a point stressed in both Potts 2005 and Amaral et al. 2008. He
suggests that my use of proof trees here is misleading.
He may be right, but I do not think the problem is so serious. In essence, defining a rule
(16) Proof tree interpretation (after Potts). Let $T$ be a proof tree with at-issue
term $\alpha : \sigma^a$ on its root node, and distinct terms $\beta_1 : t^c, \ldots, \beta_n : t^c$
on nodes in it. Then the interpretation of $T$ is
$\langle \alpha : \sigma^a, \{\beta_1 : t^c, \ldots, \beta_n : t^c\} \rangle$.
Here α and β are variables over lambda terms, and $\sigma^a$ is a variable over
semantic types. The superscripts distinguish the types as either at-issue
(superscript a) or CI (superscript c). Effectively, conventionally implicated
content is shunted into a separate dimension of meaning. The bullet therefore
functions as a bookkeeping device in the proof.
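The rule in (16) can be pictured procedurally. The following Python sketch is only an illustration and not part of the Potts logic: the `Node` class, the encoding of terms and types as strings, and the function name `interpret` are all assumptions made here. It walks a proof tree, keeps the at-issue term on the root, and collects every term of type t^c found on any node. The sample tree is a schematic derivation with placeholder predicates P and Q.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# (lambda term as text, type as text), e.g. ("Q(a)", "t^a"); an assumed encoding
Typed = Tuple[str, str]


@dataclass
class Node:
    # Bullet-conjoined terms on this node; a plain node carries exactly one.
    terms: List[Typed]
    # The nodes this node was derived from (empty for lexical premises).
    premises: List["Node"] = field(default_factory=list)


def interpret(root: Node) -> Tuple[Typed, Set[Typed]]:
    """Pair the root's at-issue term with every t^c term found in the tree."""
    at_issue = next(t for t in root.terms if t[1].endswith("^a"))
    ci: Set[Typed] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        # Only propositional CI terms (type t^c) are collected.
        ci.update(t for t in node.terms if t[1] == "t^c")
        stack.extend(node.premises)
    return at_issue, ci


# Schematic derivation: a CI predicate P and an at-issue predicate Q of a.
a = Node([("a", "e^a")])
p = Node([("λx.P(x)", "<e,t>^c")])
bullet = Node([("a", "e^a"), ("P(a)", "t^c")], [a, p])   # result of R4
a_again = Node([("a", "e^a")], [bullet])                 # result of R5
q = Node([("λx.Q(x)", "<e,t>^a")])
root = Node([("Q(a)", "t^a")], [a_again, q])             # result of R2

print(interpret(root))
```

The CI proposition P(a) is recovered even though it sits on an internal node, which is the sense in which the whole proof, and not only its root, feeds interpretation.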
The action of these three elements of the Potts logic, then, is as follows.
First, types for conventional implicature are defined; crucially, there are no
types that take conventionally implicated content as input. Second, these
types are combined via the rules in (R1-6). With respect to conventional
implicatures, this means the effect is to isolate conventionally implicated
content from at-issue content with a bullet, by rules (R4) and (R5). •-terms
are then separated into separate dimensions of meaning, by the schema in
(16).
Let us consider how this logic can be used for the analysis of mixed
content objects. It is easy to see that it cannot be so used in its current
form, given the assumption that the at-issue and CIE content are introduced
by the lexical item simultaneously. The type construction rules (again, see
Appendix A for details) provide for types of the form $\langle\sigma,\tau\rangle^a$, purely at-issue
types, and $\langle\sigma,\tau\rangle^c$, purely CI types. Intuitively, in the case of pejoratives
we require an object with the type of an ordinary predicate in the at-issue
dimension, and one of propositional type which is CIE.23 What we need is
a typing for objects that are of mixed type, but this cannot be produced in
LCI . As far as I can see, the only way to model mixed content in LCI would
be to assume that content can be introduced in two distinct stages. This
on semantic derivation trees and semantic derivation proofs should yield the same results,
given that the mechanisms of derivation are equivalent. I do not see a substantial difference
in giving derivation trees citizen status and giving the same kind of status to proof trees. In
any case, the proof-based rule is less odd in the context of derivations proceeding in concert
with a syntax, and problems that could arise with e.g. λ-abstraction will not arise in the
context of CIE content, where (as far as is known presently) abstraction does not occur. Still,
if the reader feels happier with using trees, she is welcome to perform the translation, which
is technically trivial.
23 If one follows e.g. Williamson and takes pejoratives to introduce predicates in the CI
dimension as well, the situation changes somewhat, but the basic problem is the same. We
will see cases of this type in section 2.6.
idea can be implemented by assuming that pejoratives introduce an at-issue
object, which is then predicated in some way by a CI object via R4. The result
will be a CI proposition and an at-issue predicate. In the case of Kraut, we
would have the following. (Here ‘∩’ is the kind formation operator used by
e.g. Chierchia 1998.)
\[
\frac{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^a \qquad \lambda P.\,\mathrm{bad}({}^{\cap}P) : \langle\langle e,t\rangle^a, t^c\rangle}
     {\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^a \bullet \mathrm{bad}({}^{\cap}\mathrm{German}) : t^c}\;\text{R4}
\]
This is the desired logical form. But this kind of approach requires
allowing mixed content objects to separately introduce multiple pieces of
content. This analysis seems to destroy the intuition that pejoratives and
other instances of mixed content are singular semantic objects with a dual
character. It indeed strikes me as highly unnatural to have a lexical entry
realized in terms of multiple, fully separate entities.24,25 I therefore take it to
be truer to intuitions to modify the logic in such a way that mixed content
can be modelled directly. This is done in the following section.
2.4 $L^{+}_{CI}$
This section of the paper proposes $L^{+}_{CI}$, an extension of $L_{CI}$ that can handle
mixed content. In the process, we will also define a sublogic of $L^{+}_{CI}$, $L^{+S}_{CI}$,
which introduces a set of types for CIE objects that have resource-sensitive
properties.
The first necessary step involves adding resource-sensitive CIE types to
LCI . The reason is that there are mixed content items which are predicative
in both dimensions. Pejoratives introduce mixed content: but only part of
this content, the at-issue portion, is predicative (or so I have argued). The
CIE content is propositional. Because it is propositional, there is no special
24 The case of presupposition may seem formally similar on a superficial level, but it is rather
different in that presuppositions (on some perspectives at least) simply indicate definedness
conditions for the at-issue content, whereas here the two bits of content are entirely separate
and represent fully distinct discourse contributions.
25 Note also that the proposed analysis is different from analyzing single lexical items as
consisting of a single complex condition; the two types of decomposition are entirely
different in quality. Assigning a word a meaning of the form λx[P (x) ∧ Q(x)] seems rather
different from giving it a pair of meanings λx[P (x)] and λx[Q(x)] which are meant to
apply to the input at different points in the derivation. The latter seems appropriate in only
special situations, e.g. when a word makes two distinct contributions that can be traced
back to specific distinct parts of the word. We will return to such examples in section 2.6,
where I will discuss the general merits of the decompositional strategy.
need for resource-sensitive types here; but in cases where there is a dual
predication, a lack of resource sensitivity will cause serious problems in the
meaning composition, as I will detail shortly. It is not hard to find cases
of mixed content where both the at-issue content and the CIE content are
predicative. An instance can be found in the Japanese honorific system.
Certain honorifics in Japanese come with special morphology which clearly
carries the honorific load; these sorts of expressions are analyzed by Potts &
Kawahara (2004) as introducing a kind of expressive content. In such cases,
it is easily possible to analyze the morphemes as introducing supplementary
expressive content exclusively. However, there are other lexical items which
simultaneously honor some individual and predicate something of her. An
example is irassharu ‘come[Hon]’.
(17) sensei-ga irasshaimasi-ta
     teacher-Nom came.Hon-Pst
     ‘The teacher came’ (the teacher is being honored)
Here, the verb simultaneously says of the teacher that she came, and indicates
that she is deserving of honor.26 This verb satisfies both the criteria for
mixed content: it introduces both an at-issue predication and expresses
honorification at the CIE level.27 Further, the verb is (at the surface at least)
monomorphemic. It cannot be separated into morphemes introducing at-
issue and expressive content separately, unlike (for instance) the honorifics
studied by Potts & Kawahara (2004), which clearly contain morphemes which
separately provide honorific meanings. This does not of course preclude a
decompositional analysis, on which more below. But, barring independent
(synchronic) reasons for such an analysis, it seems desirable to analyze this
expression as simultaneously introducing two types of meaning, and so as a
bearer of mixed content.
The upshot is that honorifics like irassharu are instances of mixed content
which are predicative in both dimensions of meaning. How could such exam-
ples be analyzed in LCI ? Note what will happen if we make the obvious move,
and analyze this expression as involving an object of at-issue predicative
type, and a CIE object of similar type, conjoined by a bullet as usual:
26 Or however one wishes to paraphrase the honorific relation; I will not address this question
here in detail. See section 2.4 for some brief discussion.
27 For arguments that honorific content is expressive, see Potts 2005, Potts & Kawahara 2004,
and Kim & Sells 2007.
(18) irassharu = $\lambda x.\,\mathrm{come}(x) : \langle e,t\rangle^a \bullet \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^c$
Applying this object to the referent of sensei ‘the teacher’ (which I will treat
as a referring expression for simplicity) would yield the result below by R4,
if R4 were defined for expressions conjoined by the bullet operator, which it
is not. Extending R4 to •-conjoined objects therefore requires
a new rule. Let us see what such a rule would
be for purposes of discussion. This rule simply assumes that we perform
pointwise application of every element conjoined by a bullet according to the
proper rules, which will be R2 for the at-issue side of the bullet and R4 for
the CIE side. The use of R4 of course means that the content of the input to
the CI type will be duplicated in the output, yielding the results of the two
applications, and an unmodified input as well.
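(19)
\[
\frac{\alpha : \langle\sigma^a,\rho^a\rangle \bullet \beta : \langle\sigma^a,\tau^c\rangle \qquad \gamma : \sigma^a}
     {\alpha(\gamma) : \rho^a \bullet \beta(\gamma) : \tau^c \bullet \gamma : \sigma^a}
\]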
With this rule we can attempt a derivation of (17), which will go as follows.
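\[
\frac{\lambda x.\,\mathrm{come}(x) : \langle e,t\rangle^a \bullet \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^c \qquad t : e^a}
     {\mathrm{come}(t) : t^a \bullet \mathrm{honor}(s,t) : t^c \bullet t : e^a}
\]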
Since the CIE content is not, by R4, resource-sensitive, the predication by
the right conjunct of the • in the premises will yield the result of the
application, as desired, but also will return the original at-issue input to the
functional application. But this is undesirable: the result is not semantically
interpretable. In Potts’s work, where CIE expressions are restricted to those
introducing supplementary content, the CI types were required to have a
resource-insensitive nature. But, as we can see, in cases of mixed content
it yields the wrong results. We therefore need to add a new sort of content
which is both CIE and resource-sensitive.
The result of adding types for resource-sensitive CIE content to $L_{CI}$ is
called $L^{+S}_{CI}$. I will use a superscript s to distinguish what I will call shunting
types, types for those semantic objects that ‘shunt’ information from one
dimension to another, without leaving anything behind for further modifi-
cation. The type system obtained by adding these types to LCI is defined in
Appendix B.1. With this type classification, it becomes possible to define a
rule specific to nonsupplementary conventional implicatures.
(R7)
\[
\frac{\alpha : \langle\sigma^a,\tau^s\rangle \qquad \beta : \sigma^a}
     {\alpha(\beta) : \tau^s}
\]
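The contrast between (R4) and (R7) is a contrast in what survives application. The Python sketch below is a toy model under stated assumptions: `Term`, `Fn`, `r4` and `r7` are names invented here, terms and types are plain strings, and only the bookkeeping of inputs and outputs is modelled. It shows that CI application hands the at-issue argument back next to its result, while shunting application consumes it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Term:
    text: str    # printable lambda term
    type_: str   # printable type, e.g. "e^a", "t^c", "t^s"


@dataclass(frozen=True)
class Fn:
    apply: Callable[[str], str]  # builds the result term from the argument text
    arg_type: str                # type of the at-issue argument
    out_type: str                # ends in "^c" (CI) or "^s" (shunting)


def r4(fn: Fn, arg: Term) -> List[Term]:
    """CI application: the argument is returned next to the CI result."""
    assert fn.out_type.endswith("^c") and arg.type_ == fn.arg_type
    return [arg, Term(fn.apply(arg.text), fn.out_type)]


def r7(fn: Fn, arg: Term) -> List[Term]:
    """Shunting application: the argument is consumed, only the result remains."""
    assert fn.out_type.endswith("^s") and arg.type_ == fn.arg_type
    return [Term(fn.apply(arg.text), fn.out_type)]


teacher = Term("t", "e^a")
honor_ci = Fn(lambda x: f"honor(s, {x})", "e^a", "t^c")
honor_sh = Fn(lambda x: f"honor(s, {x})", "e^a", "t^s")

after_r4 = r4(honor_ci, teacher)  # two terms: the teacher is still available
after_r7 = r7(honor_sh, teacher)  # one term: the teacher has been used up
print(after_r4)
print(after_r7)
```

The leftover at-issue term in `after_r4` is harmless for supplements, which modify something that is still needed, but it is exactly the stray term that made the attempted derivation of (17) uninterpretable.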
We can then modify the rule in (16) to handle information from shunting
types as well. $\sigma^{\{x,y\}}$ indicates that σ is a type of sort x or sort y.28 We will
see a number of examples of the application of this rule in what follows.
(20) Generalized Interpretation (first attempt). Let T be a proof tree
with at-issue term $\alpha : \sigma^a$ on its root node, and distinct terms
$\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then the interpretation of T is
$\langle \alpha : \sigma^a, \{\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\} \rangle$.
The combination of (R7) and the new interpretation rule in (20) serves to main-
tain the original generalizations about supplementary meanings provided by
LCI while expanding the system’s coverage to conventional implicatures that
introduce the primary meaning of the sentence they appear in. In section 3, I
will show that the possibilities made available by the existence of these types
are exploited by natural language, even outside the domain of mixed content.
The resources to create the needed kind of objects to model mixed content
are obviously already present in $L^{+S}_{CI}$. We already have what we need: at-issue
types and CI types. We need only a way to produce product types across the
two dimensions, and then an application rule telling us what to do with such
types when we have them. I will now provide these tools; the resulting type
system is called $L^{+}_{CI}$.
It is rather simple to add the relevant types. We need only a single
typing rule producing mixed types. This rule is provided in Appendix B.2. It
produces types of the following form:
$\langle\sigma,\tau\rangle^a \times \langle\zeta,\upsilon\rangle^s$
This object is a product type where the conjoined types are an at-issue
type and a shunting type.29 Note that the input to the at-issue type and
the shunting type need not be of the same semantic type; this means that
it is in principle possible that the situation arises where the two will have
incompatible inputs. Such typings will not work in composition though, as
28 I thank Yasutada Sudo for helping me to correct an infelicity in an earlier version of this
definition.
29 These objects are rather similar to the dot objects of Pustejovsky (1995), as already mentioned
in footnote 21. The difference is that, in Generative Lexicon theory, trying to make use of
both ‘sides’ of the dot object generally results in zeugmatic infelicity as in (i), so there is no
rule like (R8) even in the extended system (Asher & Pustejovsky 2005).
(i) ?? John hung a poster on and walked through the door.
they will not be interpreted by any rule, which will rule them out in practice.
Mixed types like these are paired with λ-terms of the form $\alpha \diamond \beta$: ‘$\diamond$’ (hereafter
‘diamond’) signifies a semantic object of mixed type. We now need rules for
interpreting these types. I propose the following two.
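(R8)
\[
\frac{\alpha \diamond \beta : \langle\sigma^a,\tau^a\rangle \times \langle\sigma^a,\upsilon^s\rangle \qquad \gamma : \sigma^a}
     {\alpha(\gamma) \diamond \beta(\gamma) : \tau^a \times \upsilon^s}
\]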
Given as input a mixed type and an object of the at-issue type that is input
to both conjoined elements in the mixed type, (R8) outputs the result of
applying each element of the mixed type to the input, where both objects are
conjoined with ‘$\diamond$’ as before. An example of this is precisely the derivation of
mixed content terms, where both CIE content and at-issue content look for
objects of the same type as input; we will see many examples in the coming
sections. We will need one further rule telling us what to do with mixed terms
when the CIE part of the derivation is complete: this is provided as R9.
(R9)
\[
\frac{\alpha \diamond \beta : \sigma^a \times t^s}
     {\alpha : \sigma^a \bullet \beta : t^s}
\]
This rule instructs us to replace mixed type terms involving the conjunction
‘$\diamond$’ with terms conjoined by a ‘•’ when the CIE object is propositional (of type
t). Roughly, we have a change in bookkeeping device corresponding to a
change in typing: the diamond indicates that the two terms it conjoins are
still ‘active’ in the derivation, but the bullet indicates that the CIE side has
already gotten all its arguments and is ready for interpretation. R9 thus,
in a sense, moves shunting-typed terms out of active use. Doing so allows
for interpretation via the rule in (20). Again, we will see examples in the
following sections.
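The interplay of (R8) and (R9) can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not an implementation of the type system: `Mixed`, `r8` and `r9` are names chosen here, types are tracked only implicitly (a Python function stands for an unsaturated term, a string for a saturated one), and the sample entry recasts the honorific predicate of (18) as a mixed term.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple, Union

# A finished term (text) or a function still waiting for an argument.
Sem = Union[str, Callable[[str], Any]]


@dataclass
class Mixed:
    at_issue: Sem  # left of the diamond
    shunt: Sem     # right of the diamond


def r8(m: Mixed, arg: str) -> Mixed:
    """Apply both sides of the diamond to the same at-issue argument."""
    assert callable(m.at_issue) and callable(m.shunt)
    return Mixed(m.at_issue(arg), m.shunt(arg))


def r9(m: Mixed) -> Tuple[Sem, str]:
    """Trade the diamond for a bullet once the shunting side is propositional."""
    assert isinstance(m.shunt, str), "R9 needs a saturated right-hand side"
    return m.at_issue, m.shunt


# Assumed mixed entry for irassharu: come(x) at issue, honor(s, x) shunted.
irassharu = Mixed(lambda x: f"come({x})", lambda x: f"honor(s, {x})")

step = r8(irassharu, "ty")   # both sides consume the subject
result = r9(step)            # (at-issue proposition, shunted proposition)
print(result)
```

Unlike the pointwise rule in (19), no copy of the argument is left behind, so the pair returned by `r9` can be handed directly to the interpretation rule in (20).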
At this point, it is possible to abstract away from the honorific example
provided earlier to make clear the general need to use shunting types on the
CI side of the mixed type. Recall that the CI types in $L_{CI}$ are not resource
sensitive; they always return their at-issue input as well as the result of
applying the CI type to this input. (R4) yields an object of the type $\sigma^a \bullet \tau^c$
when a functional CI type $\langle\sigma^a,\tau^c\rangle$ is applied to something of type $\sigma^a$. But
this means that, if we use CI types, then in the terms typed as
$\alpha(\gamma) \diamond \beta(\gamma) : \tau^a \times \upsilon^c$ yielded by a variant of (R8) which uses CI types, the object to the
right of the diamond will be of the form $\gamma : \sigma^a \bullet \beta(\gamma) : \upsilon^c$ itself due to (R4),
as we have seen. This means that the result of the application is of the
form $\alpha(\gamma) \diamond \gamma : \sigma^a \bullet \beta(\gamma) : \upsilon^c$. We have seen an instance of this with the
attempted (and failed) derivation of (17) above. This means that there is an
‘unused’ term of type σ a floating around in the derivation, which will result
in ill-formedness. We do not want this, and we can avoid it by using shunting
types on the right-hand side instead. Such types remove the terms they apply
to from the at-issue dimension completely, which clearly is what is needed in
this case.30
With this rule and the type system in Appendix B.2, we are able to provide
an adequate semantics for lexical items that introduce simultaneously at-
issue and conventionally implicated content, by defining objects of mixed
at-issue and CI types.31 The next section shows in detail how this can be done
for pejoratives, and the following section, 2.6, how it applies to other parts
of natural language in which we find mixed content.
2.5 Analyzing Pejoratives
It is straightforward to give an analysis of pejoratives in $L^{+}_{CI}$. Recall that we
needed a way to provide at-issue content and CIE content in a single lexical
entry. We now have the means to do so. We need only make use of the mixed
types defined in the previous section. As discussed in section 2.2, I will take
the at-issue content of pejoratives to be predicative, and the CIE content to
be propositional. We end up with the following kind of lexical entry: again, I
use Kraut as a representative example.
(21) Kraut = $\lambda x.\,\mathrm{German}(x) \diamond \mathrm{bad}({}^{\cap}\mathrm{German}) : \langle e,t\rangle^a \times t^s$
The composition will work as follows.
30 If one takes the intuitive interpretation of shunting types to be ‘main conventionally impli-
cated content,’ then the definition of mixed types indicates that there are two kinds of ‘main
content’ in mixed-type sentences. I myself do not find this very counterintuitive.
31 A reviewer asks whether we need CI types at all anymore, given the new system. The
suggestion is that one could make all types for CIE objects use the format of mixed types,
but just provide a tautological component on the at-issue side, for instance the identity
λX.X for polymorphic types. I do not see any technical reason this could not be done,
though there might be reasons one would want to make a clear distinction between mixed
and unmixed types in the type system. In any case, the comment shows that $L^{+}_{CI}$ is in fact a
genuine extension of LCI . Thanks to the reviewer for picking up on this point.
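(22) a. Juan is a Kraut.
b.
\[
\dfrac{j : e^a \qquad
  \dfrac{\dfrac{\lambda x.\,\mathrm{German}(x) \diamond \mathrm{bad}({}^{\cap}\mathrm{German}) : \langle e,t\rangle^a \times t^s}
               {\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^a \bullet \mathrm{bad}({}^{\cap}\mathrm{German}) : t^s}\;\text{R9}}
        {\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^a}\;\text{R5}}
 {\mathrm{German}(j) : t^a}\;\text{R2}
\]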
Given the rule (20), this will yield
$\langle \mathrm{German}(j), \{\mathrm{bad}({}^{\cap}\mathrm{German})\} \rangle$
as its interpretation, which will be evaluated as usual in the Potts system.

Roughly, the sentence will be true iff Juan is a German, and expressively
appropriate if the speaker feels that Germans are bad. Use of (22a) intuitively
indicates that the speaker thinks that Juan is bad himself; I showed in 2.2 that
this is not a part of the CIE content of the sentence (via embedding tests),
but one can see why it follows in this system. Since the speaker asserts that
Juan is German, and expresses a negative attitude toward German people in
general, it is natural to conclude that the speaker holds a negative attitude
toward Juan as well. It is also natural to conclude that the speaker intends,
as part of the reason for his utterance, to indicate this attitude. The content
that Juan is bad, then, is communicated, probably intentionally, but is not,
strictly speaking, a part of the semantic content of the sentence.32
2.6 Other Mixed Elements
It is easy to find examples of mixed content in the languages of the world.

It suffices to consider the characteristics of mixed expressions. They are
32 A reviewer questions the analysis on the basis of examples like (i) and (ii).
(i) He’s German but at least he’s not a Kraut.
(ii) He’s a Boche but at least he isn’t a Kraut as well.
The reviewer finds these grammatical and suggests that they are problematic, because
only the CIE content distinguishes the two categories in each case. This is an interesting
observation, but speakers I have consulted (including myself) find the examples infelicitous.
I myself feel they are contradictory, especially (ii). I therefore will not modify the theory
to address them. But one suggestion might be that, for those that find such examples
OK, there is some content present in the pejoratives in addition to the CIE content which
distinguishes the two properties; perhaps it is even the case that some of the CIE content
has been reanalyzed as at-issue. I will not speculate further.
associated with conventional implicatures, but, since they also denote at-
issue content, they can serve as main predicates and are affected (in part)
by various semantic operators. It does not seem at all difficult to find such
expressions; in fact, many examples are noted in the literature. Let us begin
by returning to the Japanese mixed content honorifics discussed in section
2.3. There I discussed the honorific irassharu, which has the at-issue content
of an ordinary motion verb and the CIE content that the speaker honors the
individual denoted by the sentential subject. In $L^{+}_{CI}$, this can easily be given
an analysis.33
(23) irassharu = $\lambda x.\,\mathrm{come}(x) \diamond \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^a \times \langle e,t\rangle^s$
Given this lexical entry, we can see that the honorific will participate in
composition in much the same way that (predicative instances of) pejoratives
do. The difference will, of course, be that predication takes place in both
at-issue and CIE dimensions. An example is the following.
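(24) a. Yamada-sensei-ga irasshaimasi-ta
        Y-teacher-Nom came.Hon-Pst
        ‘Teacher Yamada came. (and I honor him)’
b.
\[
\dfrac{\dfrac{t_y : e^a \qquad \lambda x.\,\mathrm{come}(x) \diamond \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^a \times \langle e,t\rangle^s}
             {\mathrm{come}(t_y) \diamond \mathrm{honor}(s,t_y) : t^a \times t^s}\;\text{R8}}
      {\mathrm{come}(t_y) : t^a \bullet \mathrm{honor}(s,t_y) : t^s}\;\text{R9}
\]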
Other examples of this type include meshiagaru ‘eat.Hon’ and goranninaru
‘see.Hon’, which will receive an analysis similar (in terms of typing) to the
above irassharu, except that they will take two arguments, as the verbs are
transitive.
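To make the typing remark concrete, one possible entry for a transitive honorific is sketched below. The entry is an assumption made for illustration only: the predicate name eat, the argument order (object first, then subject), the placeholder constants o and b, and the choice to honor the subject are not spelled out in the text. On these assumptions (R8) applies twice and (R9) once.

```latex
% Hypothetical entry for meshiagaru 'eat.Hon' (illustrative assumption)
\[
\textit{meshiagaru} = \lambda y \lambda x.\,\mathrm{eat}(x,y) \diamond \lambda y \lambda x.\,\mathrm{honor}(s,x)
  : \langle e,\langle e,t\rangle\rangle^a \times \langle e,\langle e,t\rangle\rangle^s
\]
% With an object o : e^a and a subject b : e^a (placeholder constants):
\[
\begin{array}{ll}
\lambda x.\,\mathrm{eat}(x,o) \diamond \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^a \times \langle e,t\rangle^s & \text{(R8)}\\
\mathrm{eat}(b,o) \diamond \mathrm{honor}(s,b) : t^a \times t^s & \text{(R8)}\\
\mathrm{eat}(b,o) : t^a \bullet \mathrm{honor}(s,b) : t^s & \text{(R9)}
\end{array}
\]
```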
33 It is worth asking what the behavior of expressions like these is with respect to the tests
proposed by Potts, Alonso-Ovalle, Asudeh, Bhatt, Cable, Davis, Hara, Kratzer, McCready,
Roeper & Walkow (2009). These authors argue that expressive content does not participate
in a number of grammatical operations that intuitively involve identity, such as anaphora.
Indeed, the behavior of irassharu is as expected given this test.
(i) Sensei-ga irasshaimasita. Ano kojiki mo soo-shita

teacher-Nom came.Hon. that bum also so-did
‘The teacher came. (The teacher is honored.) That bum did too.’
No inconsistency is felt here, despite the epithet in the second sentence; and the second
person who came is not honored, consistent with the conclusions of the squib.
We can now consider the details of what one would have to do to analyze
these examples with only the type resources of at-issue and CI types. This
makes the need for shunting types even more obvious than before. I can
see two ways to allow for this in principle in LCI , only one of which involves
modifying the logic at all. The first, as with the propositional part of pejora-
tive meanings, involves letting mixed content elements introduce separate
pieces of content. Then we could simply stipulate that CI application takes
place before at-issue application, yielding a two-step composition process
for mixed type objects. This ordering must be introduced to exploit the
non-resource-sensitivity of CI types. We would get roughly the following,
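supposing that both at-issue and CI content is of type $\langle e,t\rangle$.
\[
\dfrac{\dfrac{\dfrac{a : e^a \qquad \lambda x.\,Px : \langle e,t\rangle^c}
                    {a : e^a \bullet Pa : t^c}\;\text{R4}}
             {a : e^a}\;\text{R5}
       \qquad \lambda x.\,Qx : \langle e,t\rangle^a}
      {Qa : t^a}\;\text{R2}
\]
which in turn yields the meaning $\langle Qa, \{Pa\} \rangle$ by the interpretation rule in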
(16). Effectively, this idea amounts to analyzing mixed content terms as two
completely separate lexical objects, one at-issue and one CI, as can be seen
from the fact that in the semantic derivation this application would have
to take place on two distinct nodes. Notice also that the two parts of the
content must be separated in the combinatorics for things to work out. I take
it that this option is entirely undesirable, just as in the case of pejoratives.
However, there may be arguments for this style of analysis in certain cases; I
will discuss some below, and also evaluate the whole style of this approach
as a possibility for the general analysis of mixed content bearers.
A second option would be to add a new composition rule to LCI and add a
means of producing mixed types, but not to introduce shunting types, instead
making use of only the standard Pottsian CI types, $\sigma^c$.34 Together with this,
we would require a composition rule for ‘mixed bullet types,’ necessary in
order to avoid the unwanted duplication of content that would result from
allowing the application of R4, as discussed in section 2.2. This rule would
have to look roughly like the following. This can be viewed as an attempt
to solve the problems introduced by the rule (19), which of course caused
difficulties stemming from lack of resource sensitivity.
34 The rule for producing such types is the obvious analogue of B.2.1.i in which ‘•’ is substituted
for ‘$\diamond$’ and all instances of shunting types are replaced with CI types.
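(25)
\[
\frac{\alpha \bullet \beta : \langle\sigma^a,\tau^a\rangle \times \langle\sigma^a,\upsilon^c\rangle \qquad \gamma : \sigma^a}
     {\alpha(\gamma) : \tau^a \bullet \beta(\gamma) : \upsilon^c}
\]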
The result of (25) is to allow application to occur in • types, but without
duplication of content. This is just what is required for cases of mixed
content. However, it comes with obvious problems. Its function is precisely
to make R4 not apply in the relevant cases. But this has bad consequences
for the typing system: it becomes inconsistent in the sense that the behavior
of types is now situation-specific. One might even wonder if objects behaving
in this way are types in the usual sense at all. Further, consider one major
purpose of allowing CI types in the first place in LCI . This was to model
the work done by supplementary CI content, which always seems to show
non-resource-sensitive behavior. If we allow for rules like (25), this behavior
is no longer a direct consequence of the system. Concretely, suppose that,
unlike the instances of supplementary content discovered so far, instances
of supplementary content that take more than one argument are discovered,
but which are still resource-insensitive. In such circumstances, conflicts may
develop between R4 and (25), which the type system would have no way to
resolve without use of ad hoc constraints external to the formal system. All
these problems are avoided by the use of shunting types.
It is not hard to find other examples of mixed content in recent work
in the semantics-pragmatics literature. Kubota & Uegaki (2009) analyze the
Japanese benefactive, which simultaneously indicates that the subject has
caused the dative argument to do some action and conventionally implicates
that the action was beneficial for the nominative argument.35
(26) Taroo-ga Hanako-ni piano-o hii-te morat-ta.
     Taro-Nom Hanako-Dat piano-Acc play Benef-Pst
     at-issue: ‘Taro made Hanako play the piano.’
     CI: ‘Hanako’s playing the piano was beneficial to Taro.’ (K&U; their
     glosses)
The crucial point here is that the benefactive introduces both a causative
at-issue meaning and a conventional implicature to the effect that the caused
event benefited the causer. Again, this expression satisfies both criteria for
mixed content bearing: it is both monomorphemic and introduces content
along two dimensions. This is plainly an instance of mixed content.
35 I follow Kubota and Uegaki’s glosses and morphological analysis.
In our system $L^{+}_{CI}$, we can represent the benefactive morau36 with the
semantics in (27a), which is of the type in (27b):
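(27) a. $\lambda P \lambda x \lambda y.\,\mathrm{cause}(y, P(x)) \diamond \lambda P \lambda x \lambda y.\,\mathrm{good}(y, P(x))$
     b. $\langle\langle e,t\rangle, \langle e, \langle e,t\rangle\rangle\rangle^a \times \langle\langle e,t\rangle, \langle e, \langle e,t\rangle\rangle\rangle^s$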
This lexical entry is of mixed type; derivations with it will proceed via the
rules (R8), for the combinatoric steps, and (R9), for the final step which shifts
the mixed content to something interpretable via (20). Here is the derivation,
with types and rules of proof only.37
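\[
\begin{array}{lll}
1. & \textit{piano} : e^a \qquad \textit{hiite} : \langle e,\langle e,t\rangle\rangle^a & \text{premises}\\
2. & \textit{hiite}(\textit{piano}) : \langle e,t\rangle^a & \text{R2, from 1}\\
3. & \textit{moratta} : \langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^a \times \langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^s & \text{premise}\\
4. & \textit{moratta}(\textit{hiite}(\textit{piano})) : \langle e,\langle e,t\rangle\rangle^a \times \langle e,\langle e,t\rangle\rangle^s & \text{R8, from 2 and 3}\\
5. & h : e^a & \text{premise}\\
6. & \textit{moratta}(\textit{hiite}(\textit{piano}))(h) : \langle e,t\rangle^a \times \langle e,t\rangle^s & \text{R8, from 4 and 5}\\
7. & t : e^a & \text{premise}\\
8. & \textit{moratta}(\textit{hiite}(\textit{piano}))(h)(t) : t^a \times t^s & \text{R8, from 6 and 7}\\
9. & \pi_1(\textit{moratta}(\textit{hiite}(\textit{piano}))(h)(t)) : t^a \bullet \pi_2(\textit{moratta}(\textit{hiite}(\textit{piano}))(h)(t)) : t^s & \text{R9, from 8}
\end{array}
\]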
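For concreteness, the two projections in the last line of the derivation can be spelled out by β-reducing (27a). This is a sketch which assumes that moratta simply abbreviates the term in (27a) and that hiite(piano)(h) stands for ‘Hanako plays the piano’.

```latex
% Substituting P := hiite(piano), x := h, y := t into (27a)
\begin{align*}
\pi_1(\textit{moratta}(\textit{hiite}(\textit{piano}))(h)(t))
  &= \mathrm{cause}(t,\ \textit{hiite}(\textit{piano})(h)) : t^a\\
\pi_2(\textit{moratta}(\textit{hiite}(\textit{piano}))(h)(t))
  &= \mathrm{good}(t,\ \textit{hiite}(\textit{piano})(h)) : t^s
\end{align*}
```

The first line corresponds to the at-issue gloss in (26) and the second to its CI gloss.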
Formal and informal pronouns in various European languages such as
tu/vous in French or tu/usted in Spanish also carry mixed content, as discussed
by Horn (2007). These objects carry the conventional implicature that
the speaker feels (as if he should be) formal (informal) toward the addressee,
while having the at-issue indexical denotation of a normal second person
pronoun, on which they pick out the addressee of the context (Kaplan 1989).
Again, they are (at the surface) monomorphemic, and they plainly introduce
both at-issue and CIE content, making them mixed content bearers by the
proposed criteria. This means the formal versions can be assigned the following
denotation, where $s_c$ denotes the speaker of the context and $h_c$ its
hearer:
(28) $h_c \diamond \mathrm{honor}(s_c, h_c) : e^a \times t^s$
I make use of just an honorific relation here, following Potts & Kawahara
(2004). I do not want to take a position on its content here because mere
use of a pronoun need not indicate that the addressee is actually honored. It
is difficult to decide exactly what should be made of insincere uses of such
pronouns. Potts & Kawahara (2004) analyze Japanese subject honorifics as
36 The term morat-ta ‘Ben-Pst’ is derived from mora-u ‘Ben-Npst’ via morphological operations
that are of no concern to us here.
37 π1 and π2 here are the usual projection functions/pullbacks on product types, which work
to pick out the first or the second element of the product type, respectively.
performative, so their use already causes the ‘honoring’ relation to hold;
it is not so clear to me that this is the right analysis, for there is a merely
normative or polite use. Perhaps we should understand honor(x, y) in this
way. The same of course holds for the honorifics discussed earlier. I put
these delicate issues aside here.
This is the place to discuss the alternative decompositional analysis in
detail. Potts (2007a) provides an analysis of formal pronouns in terms of
an honorific feature applying to the pronoun meaning. The idea is that
a pronoun consists of a feature bundle which introduces certain kinds of
content via the features themselves. Kratzer (2009) elaborates this sort of
view. This is certainly another possible route for the pronoun case; the
correct answer depends on what the real nature of pronouns is, and on how
much of this should be implemented at the level of interpretation rather
than, say, morphology. I cannot address these large questions in this paper.
My work here merely implements the picture suggested by Horn’s (2007)
work. I am not ultimately certain what the right analysis of pronouns should
be.
However, I am skeptical about the prospects of extending this sort of view
to the general case of mixed content.38 The question ultimately is whether
we need a separate system of types for mixed content at all. Generalizing
from the above, one might wish to maintain the simpler system of LCI and
analyze all mixed content expressions as morphologically complex at the
level of type combination: in other words, to decompose all mixed content
bearers into at-issue parts and CIE parts, and let these parts operate on one
another to yield the right meanings. Could this strategy work? Not without
further elaboration, because cases like the Japanese benefactive above require
multiple operations at the CIE level, which we have seen cannot be handled
by using CI types. I do not see any easy way to get around this problem, even
if one admits shunting types into the system (so adopting $L^{+S}_{CI}$; see Appendix
B.1), while rejecting mixed types. But this is largely a technical problem. It is
possible that it might have a solution within the system, though I cannot see
how it would be done.39
38 Thanks to several anonymous reviewers and to Chris Potts (p.c.) for discussion of this point.
39 One possibility would be to perform an extreme decomposition and separate out a ‘morpheme’
from the benefactive of type $\langle t^a, t^c\rangle$ which would provide a conventionally implicated
modification of the whole sentence. For this to work out, one would need a way to predicate
properties (e.g. deriving benefit) of individuals occupying roles in the sentence without
doing so directly, which might be done by using neo-Davidsonian event semantics, or a
system providing ‘tags’ for grammatical roles in the way that e.g. LFG does. But allowing the
More worrisome, in my view, is the idea of necessarily decomposing all
mixed content terms. One can justify this move in the case of pronouns,
which have independently motivated analyses as feature bundles already. It
may also be justifiable for some pejoratives, like Jap, which is truncated; I
noted previously that one might take the truncation to introduce expressive
content as a separate morpheme. Perhaps it is even possible to decompose
honorifics like irassharu as something like [V COME Hon ], a motion verb with
a separate honorific morpheme. But giving a multimorphemic analysis to
epithets like bum or asshole, pejoratives like Frog or Boche,40 or (especially)
the so-called colored terms that will close the discussion in this section seems
to be a stretch. In at least some of these cases, a decompositional analysis
seems very unnatural. I do not think that a knockdown argument is available
against such analyses — one could always decompose, after all. But in at least
these cases, there is no obvious motivation for decomposition, other than
the limitations imposed by the analytical resources made available in $\mathcal{L}_{CI}$.
Without independent motivation, it seems much more natural just to analyze
them as mixed content bearers. At the very least, one would not want to be
forced to a decompositional analysis by the type system underlying the work.
As a final example of mixed content terms discussed in the literature, and
perhaps the example least amenable to decomposition, let us consider pairs
like Frege’s steed and nag, where the extensions are identical but the attitudes
conveyed distinct (Horn 2007). Terms of this kind initially appear similar
to pejoratives, but they are semantically distinct. While pejoratives express
negative attitudes toward all members of some particular group, steed, nag
and other terms that merely add ‘color’ to an at-issue description (Neale
1999) express positivity and negativity which is directed only at the individual
being described, in the case of predicative uses. Again, these expressions
are monomorphemic and introduce both at-issue and CIE content; they are
therefore mixed content bearers, which do not seem to be decomposable in
any natural way.
multiple morphemes introduced by lexical items in decompositional analyses to take distinct
scope positions and to be of different types opens the door to many impossible readings
and unattested possibilities; the costs of the story seem to far outweigh the benefits here.
40 Again, these pejoratives are selected for their lack of real sting. It is not hard to find other
pejoratives that are clearly monomorphemic in my sense, but most of them are sensitive
enough that I will avoid even their mention, much less their use.
(29) a. Get my steed from the stable.
        at-issue: ‘Get my horse from the stable.’
        CIE: ‘My horse is a noble animal.’
     b. Get my nag from the stable.
        at-issue: ‘Get my horse from the stable.’
        CIE: ‘My horse is a useless animal.’
This generalization can be taken to mean that colored terms have denotations
of a similar type to the subject honorifics discussed earlier. We can give them
lexical entries as follows.
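(30) a. $[\![\textit{steed}]\!] = \lambda x.\,\mathrm{horse}(x) \mathbin{\blacklozenge} \lambda x.\,\mathrm{noble}(x) : \langle e,t \rangle^{a} \times \langle e,t \rangle^{s}$
     b. $[\![\textit{nag}]\!] = \lambda x.\,\mathrm{horse}(x) \mathbin{\blacklozenge} \lambda x.\,\mathrm{useless}(x) : \langle e,t \rangle^{a} \times \langle e,t \rangle^{s}$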
The behavior of these items in semantic derivations should be obvious by
now, so I omit the details of the derivations.
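As an informal illustration of the entries in (30), the following Python sketch treats a mixed content bearer as a pair of predicates, one per dimension of meaning. The class name `Mixed`, the toy sets of individuals, and the use of booleans for the CIE component (standing in for whether the expressed attitude is appropriate) are assumptions made for illustration only; they are not part of the formal system.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Entity = str
Pred = Callable[[Entity], bool]


@dataclass(frozen=True)
class Mixed:
    """A mixed content bearer: an at-issue predicate paired with a CIE predicate."""
    at_issue: Pred  # corresponds to the <e,t>^a component
    cie: Pred       # corresponds to the <e,t>^s component

    def apply(self, x: Entity) -> Tuple[bool, bool]:
        """Feed the same individual to both components; return (at-issue, CIE)."""
        return (self.at_issue(x), self.cie(x))


# Toy model (hypothetical facts, for illustration only).
HORSES = {"silver", "dobbin"}
NOBLE = {"silver"}
USELESS = {"dobbin"}

steed = Mixed(lambda x: x in HORSES, lambda x: x in NOBLE)
nag = Mixed(lambda x: x in HORSES, lambda x: x in USELESS)

# Same at-issue extension, different attitudes conveyed.
print(steed.apply("dobbin"))
print(nag.apply("dobbin"))
```

On this toy model the two terms agree on every individual in the at-issue dimension and differ only in the second member of the pair, which is the pattern described for (29).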
Let me briefly mention another case provided by McCready & Schwager
(2009), who discuss the Viennese German intensifier ur in this system. One
use of ur is to intensify the meaning of a noun or adjective:
(31) a. Das ist ur interessant.
        that is ur interesting
        ‘That is totally interesting.’
     b. Er ist ein ur Idiot.
        he is a ur idiot
        ‘He is a total idiot.’
The meaning of this modifier has two parts. First, it performs intensification
in the at-issue dimension, so (31a) means that the referent of that is extremely
(or ‘totally’) interesting; but the speaker also indicates that she holds some
emotive attitude toward the sentential content. This latter part is expressive
or conventionally implicated, and indeed bears the usual hallmarks of emotive
expressive meanings: for example, it is highly context dependent with respect
to positivity and negativity.41 McCready and Schwager further provide a
formal semantics for the intensifier in $\mathcal{L}^{+}_{CI}$. The analysis is complex, and I
will not review it here; but it is at least clear that ur passes the tests I have
proposed for mixed content bearers.
41 Footnote 50 discusses the issue of context dependence of emotive meanings further.
I suppose that there are many other kinds of mixed content, but most
have not come to the attention of researchers yet. The previous discussion
should at least show the usefulness of the notion. There is plainly much more
work to be done on the range of conventionally implicating and expressive
items in the world’s languages, but I hope that the small sample given here
and in the previous section shows that the type-theoretic tools proposed here
have useful application in their analysis.
3 Main CIEs
The logic proposed in the previous section, $\mathcal{L}^{+}_{CI}$, does more than allow for the
analysis of mixed content. The introduction of shunting types that was shown
to be necessary for that purpose also makes available another possibility for
semantic denotation. As we have seen, the result of composition with mixed
terms is similar in the end to the addition of supplementary information via
conventional implicatures: this similarity is modeled by letting both sorts of
CIE content be conjoined to at-issue content via the bullet. Shunting types,
though, because of their resource sensitivity, allow for a situation where
there is no at-issue content at all. The aim of this section is to show that this
feature of the logic should not be taken as a negative one.
The existence of shunting types implies that it is possible that a particular
sentence (or utterance) can convey only CIE content. We will examine several
cases where this situation appears to be realized. In general, this situation
is somewhat special; the uses of language most often analyzed in linguistic
and philosophical work serve to convey information about the world, rather
than to express aspects of the speaker’s mental state or meta-information
about the conversation, which (arguably) is the function of conventional
implicature. Information about the world is thus conveyed mostly by default
here, or in ways other than via the conventional implicature itself, e.g. when
the ‘primary’ content is present in the context, or entered into it by other
means. This observation suggests a division in content type which we will
find to be borne out, at least at the level of inspection that I can provide in
the present context.
The discussion is structured as follows. In section 3.1, I briefly show
why shunting types imply that CIE content can be primary. Section 3.2
examines a first case, that of single-word utterances of particles
of the kind introduced in Kaplan 1999. There it is also shown that these
cases exhibit unexpected behavior from the perspective of $\mathcal{L}_{CI}$ in that they
can fall in the scope of certain semantic operators. As it turns out, the
existence of shunting types makes it possible to allow for these cases while
simultaneously retaining Potts’s generalizations about the interaction of
semantic operators and CIE content. Section 3.3 discusses the Japanese
adverbial yokumo, which exhibits a different kind of behavior: while the
denial test supports an analysis of the content of sentences containing this
adverbial as CIE, there is composition within the adverbial scope, unlike what
is found with Kaplan’s particles (as noted by Kratzer 1999). It is shown that
analyzing yokumo as being of shunting type provides an explanation
of its behavior with respect to denials. Section 3.4 concludes with some suggestions
about possible related phenomena.
3.1 Why Main Content?
The reason that shunting types allow for utterances with only CIE content
is the resource-sensitivity of these types. The function of shunting types
is to ‘shunt’ at-issue content into the CIE dimension of meaning; because
of the resource-sensitivity of these types, no at-issue content remains. Any
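successful derivation will result in an object of type $t^{s}$. Here is a sample, with
two applications:
$$
\dfrac{\beta : \tau^{a} \qquad \dfrac{\alpha : \sigma^{a} \qquad \gamma : \langle \sigma, \langle \tau, \upsilon \rangle \rangle^{s}}{\gamma(\alpha) : \langle \tau, \upsilon \rangle^{s}}\,\text{R7}}{\gamma(\alpha)(\beta) : \upsilon^{s}}\,\text{R7}
$$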
Plainly, no at-issue content remains.
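The resource sensitivity at work in this derivation can be sketched in Python. The function `shunt_apply` below encodes one assumed reading of R7 (a shunting-typed functor consumes an at-issue argument and returns only a shunting-typed result); the `Term` class, its field names, and the encoding of types as nested pairs are hypothetical conveniences, not the official rule statement.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# A type is a basic type name or a pair (argument type, result type).
Ty = Union[str, Tuple["Ty", "Ty"]]


@dataclass(frozen=True)
class Term:
    expr: str  # printed form of the lambda term
    ty: Ty     # its type, without the dimension marker
    dim: str   # "a" = at-issue, "s" = shunting


def shunt_apply(functor: Term, arg: Term) -> Term:
    """Assumed reading of R7: a shunting functor consumes an at-issue argument."""
    if functor.dim != "s" or arg.dim != "a":
        raise TypeError("R7 needs a shunting functor and an at-issue argument")
    if not isinstance(functor.ty, tuple) or functor.ty[0] != arg.ty:
        raise TypeError("argument type mismatch")
    # Only the shunting-typed result is returned: the argument is used up,
    # so no at-issue term survives the step.
    return Term(f"{functor.expr}({arg.expr})", functor.ty[1], "s")


alpha = Term("α", "σ", "a")
beta = Term("β", "τ", "a")
gamma = Term("γ", ("σ", ("τ", "υ")), "s")

step1 = shunt_apply(gamma, alpha)  # γ(α) : <τ,υ>^s
step2 = shunt_apply(step1, beta)   # γ(α)(β) : υ^s
print(step2)
```

Because each application returns a single shunting-typed term and nothing else, the final result carries no at-issue component, mirroring the root of the derivation above.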

We have seen that shunting types are needed in the analysis of mixed
content. But their existence implies that there could be expressions that are
exclusively of shunting type. The rest of this section indicates some instances
of such expressions in various natural languages. Before the empirical facts,
though, two theoretical issues must be addressed; one relatively simple, and
one difficult.
The first issue is that the definition of proof tree interpretation in (20)
cannot be used when an utterance lacks asserted content. The reason is
that the definition assumes the existence of an object of type $t^{a}$ on the root
node, but when there is no asserted content, there is no such object.42 It
is therefore necessary to modify the definition to allow for this case. Note
that it also seems necessary to modify the original definition provided by
42 Thanks to Kai von Fintel (p.c.) for bringing this issue to my attention.
Potts (2005) as well, for precisely the same reasons; I therefore modify (20) to
cover the case where the utterance contains only content of type $t^{c}$ as well. I
will simply stipulate that in cases where a sentence lacks asserted content
it is still interpreted as a 2-tuple, but one with a first (left) element which is
always satisfiable. I will denote this trivial assertion by $T$. The result of all
this is a definition with two distinct cases, one which applies when there is
an asserted proposition, and one which applies when there is not.
(32) Generalized Interpretation (final).
     i. Let $\mathcal{T}$ be a proof tree with at-issue term $\alpha : \sigma^{a}$ on its root node,
        and distinct terms $\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then the
        interpretation of $\mathcal{T}$ is $\langle \alpha : \sigma^{a}, \{\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\} \rangle$.
     ii. Let $\mathcal{T}$ be a proof tree with term $\alpha : \sigma^{\{c,s\}}$ on its root node,
        and distinct terms $\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then
        the interpretation of $\mathcal{T}$ is $\langle T, \{\alpha : t^{\{c,s\}}, \beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\} \rangle$.
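The two clauses of (32) can be read as a simple case split on the dimension of the root term. The Python sketch below is only an approximation under stated assumptions: `Node`, `TRIVIAL`, and `interpret` are hypothetical names, terms are represented by their printed form, and the trivial assertion is modeled as a fixed placeholder value.

```python
from typing import List, NamedTuple, Set, Tuple


class Node(NamedTuple):
    expr: str  # printed form of the term
    ty: str    # semantic type, e.g. "t"
    dim: str   # "a" (at-issue), "c" (CI) or "s" (shunting)


# Placeholder for the always-satisfiable trivial assertion.
TRIVIAL = Node("T", "t", "a")


def interpret(root: Node, others: List[Node]) -> Tuple[Node, Set[Node]]:
    """Return the 2-tuple <asserted term, set of CIE propositions>."""
    cie = {n for n in others if n.ty == "t" and n.dim in ("c", "s")}
    if root.dim == "a":
        return (root, cie)          # clause (i): there is asserted content
    return (TRIVIAL, cie | {root})  # clause (ii): the root joins the CIE set


# Clause (i): an assertion with one supplementary CI.
rich = Node("rich(bill)", "t", "a")
appositive = Node("philanthropist(bill)", "t", "c")
print(interpret(rich, [appositive]))

# Clause (ii): an utterance with only CIE content.
ouch = Node("ouch", "t", "s")
print(interpret(ouch, []))
```

In the second call the first member of the pair is the placeholder rather than any content contributed by the utterance, which is the effect the stipulation is meant to achieve.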
The second issue is less easily resolved. We have a fairly good idea of
what conditions there are on assertion and what norms govern this speech
act. But these norms do not necessarily apply when there is no asserted
content present in an utterance. What then are the norms of the use of
sentences which have CIE content as their primary content?43 This is a
difficult question and one which might be asked about all uses of CIE content.
It is not really clear at this point exactly what the normative conditions are
on the use of supplementary CIEs, for example. A full answer is therefore
far beyond the scope of this paper. I can only suggest a path toward an
answer here. It seems that what the ‘norms of expression’ are depends on
what kind of act is at issue. In assertion we are, roughly, concerned with the
transmission of true information. If a sentence is false, then a norm has been
violated. With respect to CIE content, one can think of a notion of ‘expressive
correctness,’ following Kaplan; the question then becomes what exactly it
takes for something to be expressively correct. The answer to this turns on
what one takes the function of CIEs to be. It is not clear to me that we have
the necessary understanding of their function yet. Once we do, we will be in
a better position to articulate the norms of expressive use.
Let us now turn to some empirical facts, focusing on particles and adver-
bials.
43 Thanks to Kai von Fintel (p.c.) for raising this question.
3.2 Particles
Sentence-modifying particles introduce several interesting issues. First, we
can consider the case of particles that do not modify any sentences, such as
man.
(33) Man!
This kind of case is discussed briefly by McCready (2008b). There man was
taken to be a conventional implicature-introducing propositional modifier
that applies to a proposition made available by context. If one agrees with this
analysis (and if one follows the analysis of proposition-modifying sentence-
initial man offered in that paper) one ends up with an undesirable situation
where both man(φ) and φ are directly communicated. The reason is that
man would end up being analyzed as of type $\langle t, t \rangle^{c}$, which means that one
ends up with the denotation $\varphi : t^{a} \bullet \mathrm{man}(\varphi) : t^{c}$ for the sentence. Intuitively,
though, this is not correct: ϕ is not asserted by sentences like the above. To
see this, consider cases where a question is answered with the particle:
(34) a. A: What’s the weather like outside?
     b. B: Man!
B’s response is understood roughly as follows: B has some sort of strong
feeling about the weather outside. It is not clear what the weather outside
is actually like. In this kind of case, A is likely to infer that the weather
is extreme in some way, but exactly what way this is depends entirely on
A’s prior knowledge about the weather. We can therefore see clearly that
the proposition man modifies is not asserted by B’s utterance — if it were,
it should be recoverable, but it is not. Still, we should not take this to
mean that nothing about this proposition is communicated, only that this
communication cannot be ‘literal.’
Of course, there is another possibility for analysis. The above discussion
is relevant only if stand-alone man is in fact modifying a proposition. It is also
possible that it is a simple exclamation of the type discussed immediately
below: if this is right, then (33) indicates only that the speaker is in an excited
state. If so, then the conclusion that B’s response in (34) indicates something
about the weather follows completely from inference: given that A has asked
a question about the weather and B is indicating that he is in a heightened
emotional state, it is natural (though defeasible) to conclude that he is excited
about the weather. It is not easy to see which of these options is correct, for
it’s not clear that there are empirical tests to distinguish between the two
positions.44 However, as we’ll see, either approach proves to give support to
an analysis of particles that takes them to denote objects of shunting type.
Clearly, on either analysis, stand-alone particles provide another case
where the conventionally implicated content is the primary content of the
utterance. If we assume that a proposition is being directly modified, man
can be typed as
$\lambda p.\,\mathrm{man}(p) : \langle t^{a}, t^{s} \rangle$
ignoring the actual content of the particle, which is roughly that the speaker
has some kind of emotional reaction toward p (that it is good or bad).45
This analysis disallows the assertion of p itself, as desired. The question
of how extensively we should take particle meanings to be analyzable in
terms of shunting types is left for another occasion; it turns on the empirical
question of whether or not the propositional content of sentences modified
by particles can serve as answers to questions. In many cases it is clear that
they can, in others, perhaps not.
Another kind of even more obvious case is that of expressives that do not
perform any modification, such as salutations or fully expressive exclama-
tions (cf. Kaplan 1999; Kratzer 1999). On the second analysis of stand-alone
particles like man, they too will fall into this category.
(35) a. Thanks!
b. Good morning.
c. Ouch!
Expressions like these lack truth conditions, though they can be expressively
correct (appropriate) or not. They plainly do not assert anything.46 They can
be analyzed as objects of type $t^{c}$ (or $t^{s}$), which simply express something
about the speaker’s mental states or what she takes the situation to be like.
44 We cannot, for instance, make use of the kind of binding tests that proponents of ‘unarticu-
lated constituents’ have taken as evidence for their approach (cf. Stanley 2000 for a use of
these tests, and Cappelen & Lepore 2005 for critical discussion).
45 The semantics of man is discussed in detail in McCready 2008b.
46 As the editors point out, this is so only if one does not accept relevant aspects of the
performative hypothesis, according to which (35c), for example, would assert something like
‘I hereby express “ouch!”.’ Discussion of the hypothesis, with arguments for and against it, can
be found in Levinson 1983.
Here the extension to $\mathcal{L}^{+S}_{CI}$ does not at first appear necessary, as type $t^{c}$ is
sufficient, given that no combinatorics are taking place; but it is clear that,
in cases like these, the expressive (or conventionally implicated) content is
the main content of the utterance. We thus have a division between cases
of ‘primary’ CIs: one, modeled via shunting types, where the CI content is
functional, and another, apparently modellable either via shunting types or
CI types, where the content is not functional and expresses a constant.
However, it turns out that there are reasons to take type $t^{c}$ to be inappropriate
for these contexts. The reason is that, by definition, there are no
functional types taking CI types as input. As discussed in detail above, this
is by design: the content of e.g. appositives never seems to fall in the scope
of semantic operators. But certain operators are able to act on expressive
particles such as those discussed by Kaplan: namely, other particles.
(36) a. Ouch, man!
     b. Man, ouch!
If man is to modify ouch in these cases, it must be either of type $\langle t^{c}, t \rangle$ or
$\langle t^{s}, t \rangle$ (where the output type is also either $t^{c}$ or $t^{s}$). But if it takes an object
of type $t^{c}$ as input, the generalization about the semantic independence of
e.g. appositives is lost: we must admit functional types taking CI types as
input. If we assume that ouch denotes something of type $t^{s}$, though, we can
avoid this situation.
One might think that the two particles are merely adjacent, so that neither
needs to be analyzed as functional. To see that there is genuine interaction
between the two particles, consider the following two situations.
(37) a. Situation 1: You stub your toe on the curb while walking down
the street with your friend Curly.
b. Situation 2: Your friend Curly suddenly pokes you in the eye with
a fork.
(38) a. Ouch!
b. Ouch, man!
(38a) is an appropriate utterance in either Situation 1 or Situation 2. (38b)
gives an impression of blame: ‘it’s your fault that I am in a position to say
this appropriately!’47 This kind of accusation is obviously appropriate in
Situation 2. If uttered in Situation 1, it is somewhat odd: why is it Curly’s
fault that you’ve stubbed your toe? These considerations are enough to make
it clear that man is in fact doing something to the meaning of ouch in (38b),
and so some kind of composition is at work.
Another kind of example comes from the intensifiers discussed by Mc-
Cready & Schwager (2009). One use of these expressions is as propositional
modifiers, which intensify along the expressive dimension, as in (39).
(39) a. John totally came to the party.
     b. He fully wiped out, dude.
McCready & Schwager (2009) analyze uses like these as expressing that the
speaker has maximal epistemic commitment to her justification for her use
of the modified proposition, so (39a) would express that the speaker is
maximally committed to her justification (evidence) that John came to the
party. It turns out that these modifiers can also modify purely expressive
items in some dialects of English.
(40) Totally ouch(, dude).
On the McCready and Schwager analysis, this would express that the speaker
has maximal commitment to her justification for uttering ouch, itself an
expressive item. Presumably such justification would be a pain felt by the
speaker or something similar. But the main point for our purposes here is
that ouch is a bearer of purely expressive content. A proper analysis of cases
like these therefore will, again, require modification of expressive content.
We have now seen that there are instances in which purely expressive
content is modified. This means that we must add to the system a provision
for operators that take CIE content as input. But what type of content
should this be? The worry is that, if we allow operators over CI types ($\sigma^{c}$),
the generalizations made by Potts (i.a.) about modification of conventional
implicatures such as the content of appositives are lost. The natural way to
avoid this problem is to analyze man and totally in (39) as operators over
shunting typed objects, so to make them of type $\langle t^{s}, t^{s} \rangle$.48 Such types are
47 I believe this follows from the analysis of sentence-final man given in McCready 2008b, on
which it performs a dynamic strengthening of speech acts, though I will not provide details
here.
48 Of course, there is also a need for a typing for these operators that allows them to modify
at-issue content as well: $\langle t^{a}, t^{s} \rangle$. Depending on the facts about modification of CIE content,
easily added to the system (via clause (i) of B.1.1). With this move the Potts
generalizations are maintained in the type system.
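The typing proposal can be illustrated with a small Python sketch. It assumes, for illustration, that a particle-like modifier has exactly the two typings mentioned in the text (one taking at-issue propositions, one taking shunting-typed propositions) and none taking CI-typed input. The names `Prop`, `MODIFIER_TYPINGS`, and `modify` are hypothetical, and the actual content of the modifiers is ignored.

```python
from typing import NamedTuple, Optional, Tuple


class Prop(NamedTuple):
    expr: str  # printed form of a proposition-denoting term
    dim: str   # "a" (at-issue), "c" (CI) or "s" (shunting)


# Assumed typings for a modifier such as man: <t^a, t^s> and <t^s, t^s>.
# There is deliberately no entry with "c" as input dimension, mirroring
# the ban on functional types that take CI types as input.
MODIFIER_TYPINGS: Tuple[Tuple[str, str], ...] = (("a", "s"), ("s", "s"))


def modify(name: str, target: Prop) -> Optional[Prop]:
    """Apply a particle-like modifier; return None if no typing fits."""
    for in_dim, out_dim in MODIFIER_TYPINGS:
        if target.dim == in_dim:
            return Prop(f"{name}({target.expr})", out_dim)
    return None


ouch = Prop("ouch", "s")                        # purely expressive item
appositive = Prop("philanthropist(bill)", "c")  # supplementary CI content

print(modify("man", ouch))        # composition succeeds
print(modify("man", appositive))  # no typing available
```

Under these assumptions a modifier can compose with ouch while appositive content stays out of reach of any operator, so both observations are captured by the choice of input types alone.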
I believe that the particles, and particularly the expressives like (35), are
the clearest instances of sentences which lack at-issue content, and, perhaps
as a consequence, are the instances which have received the most attention
in the literature. Let us now turn to another kind of sentence that does not
appear to have at-issue content.
3.3 Yokumo
The second example we will consider is that of sentences modified by the Japanese
adverbial yokumo. In line with McCready 2004, I will argue that yokumo
introduces three pieces of content: a) a statement of the speaker’s emotional
attitude toward the modified proposition ϕ, b) a statement regarding the
prior probability the speaker assigned to ϕ, and c) a condition on mutual
knowledge of ϕ. Unlike McCready 2004, however, I will analyze conditions
(a) and (b) as conventionally implicated rather than asserted, for reasons
which will become clear. The question of the status of (c) is more difficult to
resolve, but in the end I will conclude that it is presuppositional.
The meaning of yokumo is complex, as may already be clear from the brief
discussion above. Here are some representative examples, with somewhat
rough translations.49
(41) a. Yokumo koko ni kita (na)!
        yokumo here to came (PT)
        ‘You have a lot of guts to come here!’
     b. Yokumo ore o damashita (na)!
        yokumo me Acc tricked (PT)
        ‘I can’t believe you had the gall to trick me.’
The most obvious approximation of the meaning of the adverbial is a
simple negative statement about the propositional content.50
it may be that these two typings are consistently available for particles and other such
modifiers. Much more empirical investigation is needed before this question can be answered
definitively.
49 Most examples in this section come from McCready 2004.
50 This is the simplest version of the adverbial meaning. For many speakers, yokumo can also
be used with a positive meaning.
(42) $[\![\textit{yokumo}]\!] = \lambda p.\ \mathrm{bad}(p)$
The second component of yokumo’s meaning involves likelihood. Yokumo
indicates that the speaker did not expect the event described by the modified
sentence to occur, and that she is surprised that it actually did. There are a
variety of ways to model this situation. I will simply make use of a predicate
surprise, which can be given a semantics in terms of probabilities in ways
that are more or less obvious.51 Adding this to the denotation of yokumo
yields
(43) $[\![\textit{yokumo}]\!] = \lambda p.\ \mathrm{bad}(p) \wedge \mathrm{surprise}(p)$
One element of this adverbial’s meaning remains to be analyzed. It was
also discussed by McCready (2004): the proposition modified by yokumo
must be (believed by the speaker to be) common ground. To see that this
proposition must indeed be common ground, note that sentences modified
by yokumo are not felicitous as answers to questions.
(i) omae yokumo konna ii sakuhin kaketa na
    you yokumo this-kind-of good artwork write.able-Pst PT
    ‘I can’t believe you were able to make a piece this good!’
Whether the attitude expressed by yokumo is positive or negative appears to depend on
several factors. First, the content of the sentence: in (41b), the modified proposition describes
an event that (we can assume) was negative for the speaker, while (i) is clearly positive. Other
facts about the world also must play a role, though. Suppose that it is the speaker’s birthday,
and he comes home to find a surprise party. The hearer had told him earlier that everyone
had forgotten his birthday. Here, the tricking lacks a negative character. The identity of
the speaker also obviously plays a role. These facts are reminiscent of what we find with
modification by the particle man (McCready 2008b), which has the introduction of emotional
attitudes as one of its functions. There I introduced a function E which maps Kaplanian
contexts and propositions to emotive predicates; the relevant features of the context, and
the content of the proposition, determine an emotive predicate, which is then applied to the
proposition itself. In these more permissive dialects, the statement bad(p) in the semantics
below should be replaced with E(c)(p)(p), which is interpreted, after application of E to
the context and the proposition, either as bad(p) or good(p). The issue of how the emotive
import of expressives arises is an important one in the context of the study of expressive
meaning and one I hope to return to in later work, but is orthogonal to the purposes of the
present paper, which is mostly concerned with combinatorics.
51 The operator should be defined in terms of probabilities prior to learning that the ‘surprising’
proposition is true, which requires a notion of dynamic changes in probabilities. For
discussion, see Jeffrey 1983, Kooi 2003, or McCready & Ogata 2007.
(44) a. Context: A asks B ‘Who did Austin marry?’ (McCready 2004)
     b. #Yokumo Dallas to kekkon sita na!
        yokumo Dallas with marry did PT
        ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
This example can be taken to indicate that yokumo cannot provide new
information. In my earlier work I modeled this knowledge requirement via
a condition on update: update is only defined if both hearer and speaker
already know the content of the proposition, in conjunction with an assump-
tion of common knowledge. There are several options regarding how this
condition should be stated. On the one hand, it is possible to simply presuppose
that $CG_{\{s,h\}}(\varphi)$, i.e. that $\varphi$ is common ground for speaker and hearer;52
on the other hand, taking a less interactive approach to the dynamics of
information, we can simply stipulate that an update with yokumo(p) is only
defined if update with p does not alter the information state of speaker or
hearer. These two conditions amount to the same thing for present pur-
poses.53 I will make use of the former method in this paper.54 We arrive at
the following lexical entry.55
(45) $[\![\textit{yokumo}]\!]^{c} = \lambda p : CG_{\{s,h\}}(p).\ \mathrm{bad}(p) \wedge \mathrm{surprise}(p)$
52 See van Ditmarsch, van der Hoek & Kooi (2007) for the semantics of this operator.
53 We do not need to concern ourselves with deep questions about the difference between
knowledge and belief here, for instance.
54 In McCready 2004, I took the second route. This decision was partly motivated by the fact
that the particle na can induce felicity, which I took to mean that it can help introduce
content into the common ground. Since I will not consider the action of this particle in this
paper, we can avoid detailed discussion of common ground and update. In any case, it may
well turn out that na has a different function that makes sentences modified by it compatible
with yokumo (McCready, in preparation).
55 One might think that all this is unnecessary, given that surprise(φ) is factive, if we assume
that the logical predicate has the same interpretation as the natural language surprise, which
I see no reason to do. But even if it is presupposed that φ, must we take φ to be common
knowledge? The answer is yes. First, note that what is presupposed by surprise(φ) is not
φ but that the speaker (believes herself to have) learned φ at some past time, which is
already the wrong interpretation. Further, this presupposition should be accommodatable;
but it is not. This is surprising given the results of Kaufmann (2009), who shows that
such presuppositions should be readily accommodatable, unlike presuppositions about
the common ground. I take this to indicate that the presupposition of common ground is
needed.
This essentially restates the lexical content originally provided in McCready
2004. However, there is more to the story, as discussed in that paper. In
(45) I have, without argument, taken the common ground condition to be
presupposed, and the other two parts of the meaning to be asserted. But
if they are indeed asserted, it should be possible for a hearer to deny them
directly. However, the content of yokumo(p) cannot be directly denied.
Consider the following example.56
(46) Yokumo Dallas to kekkon shita na!
     ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
     a. # sore-wa hontoo janai
          that-Top truth Cop.Neg
          ‘That’s not true.’
     b. # uso da
          lie Cop
          ‘That’s a lie!’
Each of the possible denials in (46) is infelicitous. One might try to explain
this in terms of ‘privileged content’ or speaker relativity; it is known that it is
difficult to make claims about the truth or falsity of claims that depend (in
part) on the speaker’s preferences (cf. Lasersohn 2005; Stephenson 2007).
It makes some sense, given this, that the emotive content of the adverbial
is hard to deny. But this argument does not go through for the
probability statement.57
The analysis starts with the observation that it is not actually impossi-
ble to deny the content of the adverbial — it just cannot be done with the
responses in (46). Less direct expressions are needed.
(47) Yokumo Dallas to kekkon sita na!
     ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
     a. Chigau yo!
        wrong PT
        ‘That’s wrong!’
     b. Sonna koto nai yo!
        that-kind-of thing Cop.Neg PT
        ‘That’s not right.’
56 Here we suppose that it is known that the referent of ‘he’ is marrying Dallas.
57 If probabilities are understood as subjective, the basis for assertion may indeed be hard to
deny. But it seems clear that statements about likelihood become part of the public domain
once made, so denial of the surprise clause in the denotation of yokumo is surely possible.
These facts are reminiscent of facts noted by Potts (2005) about conventional
implicatures. How can one call the content of a nominal appositive into
question, given that it cannot be denied directly?
(48) Bill, the philanthropist, is very rich.
     a. That’s not true. (= Bill is not very rich.)
     b. Well, yeah, he is, but that’s not really right . . . (= casts doubt on
        the appositive content)
What I will call truth-directed denials like those in (46) cannot target conven-
tionally implicated content, but only asserted content. Denials like (47) can
target either type of content. If we assume that the content of yokumo is con-
ventionally implicated, the facts in (46) are therefore immediately explained.
Note that the fact that truth-directed denial can target the asserted content
in (48) and not in (46) has an immediate explanation: (48) asserts that Bill
is rich, but (46) asserts nothing at all, for it is already common ground that
Dallas and Austin got married.58
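The contrast between (46) and (48) can be sketched in Python, under the assumption adopted here that the emotive and surprise conditions of (45) are conventionally implicated and that the common ground condition is a definedness condition. The names `Meaning`, `yokumo`, and `truth_directed_denial_ok`, and the representation of propositions as strings, are hypothetical and serve only as illustration.

```python
from typing import FrozenSet, NamedTuple, Optional


class Meaning(NamedTuple):
    at_issue: Optional[str]  # None stands in for the trivial assertion
    cie: FrozenSet[str]      # conventionally implicated / expressive content


def yokumo(p: str, common_ground: FrozenSet[str]) -> Meaning:
    """Sketch of (45): defined only if p is already common ground."""
    if p not in common_ground:
        raise ValueError("presupposition failure: p is not common ground")
    # Both conditions land in the CIE dimension; nothing is asserted.
    return Meaning(None, frozenset({f"bad({p})", f"surprise({p})"}))


def truth_directed_denial_ok(m: Meaning) -> bool:
    """A denial like 'That's not true' needs asserted content to target."""
    return m.at_issue is not None


cg = frozenset({"marry(austin,dallas)"})
yokumo_sentence = yokumo("marry(austin,dallas)", cg)
appositive_sentence = Meaning("rich(bill)", frozenset({"philanthropist(bill)"}))

print(truth_directed_denial_ok(yokumo_sentence))
print(truth_directed_denial_ok(appositive_sentence))
```

The sketch says nothing about the less direct denials in (47), which on the account in the text may target either dimension.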
58 Another commonality can be found with denials. Note that there are two parts to the
‘deniable’ content of yokumo sentences, given that the proposition modified is already part
of the common ground: the emotive content and the statement of surprise. For many (but
not all) speakers, the denials of yokumo-modified sentences in (47) can only target one of
these, meaning that they can deny the good/badness of the marriage, or its surprisingness,
but not both. The same seems to hold for sentences in English where multiple conventional
implicatures are tied to the same host NP, as in (ia). Here, the denial in (ib) seems to indicate
that either a) John is not a banker, or b) that he does not own a large house. It is difficult
to understand (ib) as denying both together. If this data is correct, the identification of the
content introduced by yokumo as conventional implicature receives additional support.
(i) a. A: John, a banker, who owns a large house, is going bankrupt.

b. B: Well, yeah, true, but . . .
However, none of this follows from the analysis I am going to provide in terms of $L^{+}_{CI}$,
where the adverbial simply introduces a conjunction; unless it is assumed that only a
single conjunct can be targeted by a denial in the case of conventionally implicated content.
Formally, we might take the adverbial to introduce several distinct conditions, for example
in the form of a set of propositions. Before taking this kind of step, though, it is worth
checking to see how stable the denial facts are with respect to ‘multiple denials.’
In previous work, I analyzed these facts in Segmented Discourse Representation
Theory (SDRT; Asher & Lascarides 2003), in a way related to the
analysis of parentheticals of Asher (2000). Here I will explore a different
approach.59
One may wonder if the above facts about denial are really sufficient evidence
to justify treating the content of yokumo as conventionally implicated.
This is legitimate; but, for independent reasons, it is difficult to apply the
other standard test for conventional implicature. It is known that conventional
implicatures are scopeless with respect to semantic operators over
asserted content, such as negation, conditionals and the various modalities.
Ordinarily, one would test the behavior of the putative conventional implicature
item in operator contexts, and then draw conclusions about whether or
not it is actually asserted. Unfortunately, this proves to be impossible with
yokumo. Yokumo is resistant to appearing in nonveridical contexts, as shown
by McCready (2004).60 Because yokumo is ungrammatical in these contexts, it
is impossible to test its scope behavior, and, as a result, the operator test for
conventional implicature cannot be applied. The same goes for the binding
test. Since yokumo can’t appear in conditional consequents, it is hard to tell
whether or not its content would be bindable. But a conceptual argument
is available. Intuitively, sentences modified by yokumo serve to introduce
new information about the speaker’s mental states and attitudes. If this
content was presupposed, then (on a standard picture of presupposition) the
speaker would be assuming it to be in the common ground. But, intuitively,
the speaker is communicating her attitudes, so the presupposition picture
simply does not seem to be correct.61
59 The SDRT analysis involved assuming that each part of the lexical content of the adverbial
introduced distinct speech act discourse referents which were then connected by discourse
relations. This analysis has three problems, as I now see it. First, there is no clear reason
why the denials in (46) are different from those in (47). There is no independently motivated
reason to distinguish between these kinds of denial at the level of discourse structure (to my
knowledge). Second, I had to make an assumption about possible attachment points for the
denials to work out right, which also lacks independent motivation. Third, on my analysis
there, yokumo(p) also was taken to assert p, despite the presence of p in the common
ground already (as shown by the facts in (44)). This strikes me as highly problematic in view
of the norms of assertion: one should not assert things that are already common ground (or
even cannot, if this is taken to be a precondition on assertions). I therefore take the new
analysis presented in the main text to be preferable.
60 The reason for this may relate to evidential behavior: it seems possible that yokumo requires
that the speaker have a certain kind of relation with the proposition it modifies, in a way
related to what is found with sentence-initial man (McCready 2008b). I will not consider this
behavior in detail here.
Here I will take the results of the denial test to be conclusive, and therefore
treat the content of yokumo as conventionally implicated in what follows
(excluding the presupposition of common ground).62 The question now is
what type to assign it. As with stand-alone man, there are two options:
$\langle t^a, t^c\rangle$ and $\langle t^a, t^s\rangle$. Just as with man, there are obvious
problems with the first option. Given the resource-insensitivity of CI types, applying
a denotation of the first option to a proposition ϕ will yield
$\phi : t^a \bullet \mathit{yokumo}(\phi) : t^c$. But
this means that ϕ is asserted, and so it should be deniable. But it is not.
The first option, therefore, cannot be right. Assuming yokumo to be of type
$\langle t^a, t^s\rangle$, however, means that the result of combining the adverbial with a
proposition will be only $\mathit{yokumo}(\phi) : t^s$; nothing is asserted, so the denial
facts are predicted. The result is that sentences modified by yokumo carry
only CIE content.
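The contrast between the two candidate typings can be illustrated informally in Python. This is only a sketch under stated assumptions, not part of the formal system: the names `Meaning`, `apply_ci`, `apply_shunting` and `denial_has_target` are hypothetical, propositions are modeled as plain strings, and the placeholder `"T"` stands in for an at-issue dimension in which nothing is asserted.

```python
from typing import FrozenSet, NamedTuple

TRIVIAL = "T"  # assumption of this sketch: placeholder for "nothing asserted"


class Meaning(NamedTuple):
    at_issue: str          # asserted content (or TRIVIAL)
    ci: FrozenSet[str]     # conventionally implicated content


def apply_ci(functor: str, phi: str) -> Meaning:
    """Option 1, type <t^a, t^c>: resource-insensitive, so phi is passed
    through to the at-issue dimension in addition to feeding the functor."""
    return Meaning(phi, frozenset({f"{functor}({phi})"}))


def apply_shunting(functor: str, phi: str) -> Meaning:
    """Option 2, type <t^a, t^s>: phi is consumed, leaving no asserted content."""
    return Meaning(TRIVIAL, frozenset({f"{functor}({phi})"}))


def denial_has_target(m: Meaning) -> bool:
    """A truth-directed denial needs asserted content to target."""
    return m.at_issue != TRIVIAL


p = "Dallas-and-Austin-married"
option_1 = apply_ci("yokumo", p)        # wrongly predicts that p is deniable
option_2 = apply_shunting("yokumo", p)  # nothing asserted, as the denial facts require
```

On this toy encoding, `denial_has_target(option_1)` holds while `denial_has_target(option_2)` does not, which mirrors the reasoning in the text for preferring the shunting type.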
61 This argument seems reasonable, but the presupposition that the modified proposition
is in the common ground is less simple to get clear about. How can we be sure that
presuppositions of this sort, that have no real equivalent in non-technical natural language,
are not actually conventionally implicated? I do not know of a really good way. The issue
is general, and has received a bit of recent discussion by Schlenker (2008), who raises
worries for his theory of presupposition involving complex presuppositions that cannot be
articulated easily or at all in natural language. This is an interesting issue but a difficult one,
and I will not be able to do it full justice in this paper.
62 Another way to interpret these results is to conclude that yokumo introduces a different
kind of content, that behaves in some ways similarly to CIE content (cf. the comments of a
reviewer). This seems possible; but it also seems that, even in this case, it behaves like CIE
content where it can appear. I think this justifies using the present system to analyze it.
3.4 Conclusion
In this section we have seen several areas in which natural language appears
to make use of the possibilities afforded by shunting types, and have also
had occasion to slightly extend $L^{+}_{CI}$ to allow for modification of shunting
typed objects. I hope the reader has been convinced of their usefulness. I
do not think that this discussion exhausts the utility of shunting types: for
example, one other area where I think they could be useful is in the analysis
of exclamatives, which have the combinatory properties one would expect
from shunting-typed objects in terms of further combinatorics, given certain
assumptions.63 They also exhibit semantic similarities with yokumo and even
the modifications done by particles, which suggest a larger correspondence.
The topic is large enough that I cannot do justice to it here. Another area is
expressive small clauses, sentential phrases like (49), discussed by Potts &
Roeper (2006).
(49) You damn fool!
Utterances like this one do not exhibit any at-issue content; there is nothing
for truth-directed denials to target, for example. This fact makes it look like
shunting types should be involved. As Potts and Roeper state, though, it is
not completely clear how the details of the composition should work, and I
cannot improve on their observations here.
In a sense, the conventional implicatures introduced by shunting-typed
content remain supplementary, at least in the cases examined here; the dif-
ference with ‘ordinary’ conventional implicatures of CI type is that shunting-
typed objects supplement content that is already present, and not asserted
by the sentence providing the supplementary information. In the case of
yokumo, this content must be introduced via accommodation, if it is not
already present; but this presents no special difficulties, unlike presupposi-
tions of some kinds of expressive content (e.g. Kaufmann 2009). For some
other instances of CIE content in contexts where no assertion is made, the
situation can be different, for instance in the analysis of the Japanese modal
particle daroo provided by Hara (2008). According to this analysis, daroo(ϕ)
conventionally implicates that µ(ϕ) > 50%, but does not assert anything.
Hara notes that $L_{CI}$ is not appropriate for analyzing this case, in that, given
that this type system returns ϕ itself in the at-issue dimension, Gricean
maxims would be violated by any use of daroo to modify a proposition.
$L^{+S}_{CI}$, however, makes the right predictions (assuming that the Hara analysis
is correct). What these cases have in common is that the conventionally implicated
content is, in some sense, primary to the intent behind the utterance.
63 For instance, one must say something about ‘embedded exclamatives.’ One possible route
is to note that embedded instances of exclamatives show very different behavior from
non-embedded instances, a fact already noted by Rett (2008), who draws a sharp distinction
between the two types.
4 Quechua Evidentials: a Case Study
Let us now examine a single phenomenon (or group of phenomena) that seems
to make use of all the types of content discussed here. This is the system
of Quechua evidentials, for which $L^{+}_{CI}$ can provide an alternate analysis to
the proposal of Faller (2002), on which these evidentials modify speech acts.
I will begin by giving the basic background and facts that a theory of the
evidentials should explain. I then briefly present Faller’s speech act-based
analysis and show (following McCready 2008a) that, despite the conventional
implicature-like behavior of the evidentials, an adequate analysis cannot
be given in $L_{CI}$. I then show that such an analysis is available in $L^{+}_{CI}$. The
intent is to duplicate the basics of Faller’s analysis as closely as possible in a
conventional implicature-based system which does not make use of speech
acts. I should make two caveats before embarking on this project. First, the
proposal I make here does not account for many of the subtle issues that
arise in the Quechua evidential system, only the most basic, brute facts about
the way in which composition seems to work for the different evidentials
in the language.64 Second, the analysis of Faller (2002) is by no means the
last word on this subject. More recent work by Faller (2003, 2006, 2007)
introduces additional complexities, which I will also leave aside. This section
should therefore be taken as only a sketch of an alternate analysis, in which
we see how one can ensure some kinds of scope behavior without making
anything other than lexical stipulations about types of content.
Cuzco Quechua has several enclitic suffixes that mark evidentiality:
roughly, the nature of the speaker’s justification for the claim made by
the utterance. Faller analyzes three suffixes in detail. The first is the direct
evidential -mi, which indicates that the speaker has the best available grounds
for the claim made, which generally amounts to perceptual evidence. The
second, -si, is a hearsay evidential which indicates that the speaker heard
the information expressed in the claim from someone else. Finally, -chá, an
inferential evidential, indicates that the speaker’s background knowledge,
plus inferencing, provides evidence for the proposition the modified sentence
denotes, and asserts that the sentence might be true.
(50) a. Para-sha-n-mi
rain-Prog-3-mi
‘It is raining. + speaker sees that it is raining’
b. para-sha-n-si
rain-Prog-3-si
‘It is raining. + speaker was told that it is raining’
c. para-sha-n-chá
rain-Prog-3-chá
‘It may be raining. + speaker conjectures that it is raining based
on some sort of inferential evidence’
64 I also restrict attention to assertions; complex issues arise with questioning evidentials in
this language, which I am not sure how best to address.
Cuzco Quechua evidentials do not embed semantically; even when they

appear in the surface scope of semantic operators, they always take widest
scope (or are scopeless with respect to such operators). The negation in the
following example cannot take scope over the evidential, for instance.
(51) Ines-qa mana-n/-chá/-s qaynunchaw ñaña-n-ta-chu watuku-rqa-n
Ines-Top not-mi/chá/si yesterday sister-3-Acc-chu visit-Pst1-3
‘Ines didn’t visit her sister yesterday.’ (and speaker has evidence for this)
NOT ‘Ines visited her sister yesterday’ (and speaker doesn’t have evidence for this)
A final basic fact that a theory of evidentials in this language must explain
is that use of the hearsay evidential with a sentence does not commit the
speaker to the content of the sentence. For instance, the first clause of the
following sentence does not commit the speaker to the proposition that a lot
of money was left for the speaker, as the continuation shows.
(52) Pay-kuna-s ñoqa-man-qa qulqi-ta muntu-ntin-pi saqiy-wa-n,

(s)he-PL-si I-Illa-Top money-Acc lot-Incl-Loc leave-1o-3
mana-má riki riku-sqa-yki i un sol-ta centavo-ta-pis
not-Surp right see-PP-2 not one Sol-Acc cent-Acc-Add
saqi-sha-wa-n-chu
leave-Prog-1o-3-Neg
‘They left me a lot of money (they said/it was said), but as you have
seen, they didn’t leave me one sol, not one cent.’ (Faller 2002:191)
Thus, roughly, what is needed is the following result, where the evidential
content is not asserted:
(53) a. mi(φ) $\leadsto$ φ ∧ speaker has direct evidence for φ
b. si(φ) $\leadsto$ speaker has hearsay evidence for φ
c. cha(φ) $\leadsto$ ♦φ ∧ speaker has inferential evidence for φ
Faller uses Vanderveken’s (1990) speech act theory for her analysis. This
theory, like other theories of speech acts, assigns them preconditions for
successful performance. Faller takes evidentials to introduce additional
content into the set of preconditions. For the cases under consideration, we
need only be concerned with one kind of precondition: sincerity conditions
on successful performance of the speech act. For assertions, Vanderveken
takes it to be necessary that Bel(s, p) holds — that the speaker believes the
content of the assertion.65
Most of the action in Faller’s analysis of -mi and -chá is in the sincerity
conditions for the assertion. On her analysis, -mi adds an additional sincerity
condition to the assertion, that Bpg(s, φ). The formula Bpg(s, φ) means that
the speaker has the best possible grounds for believing φ. It is very difficult
to make this condition precise. Faller notes that what counts as best possible
grounds is dependent on the content in the scope of -mi: for externally visible
events Bpg will ordinarily be sensory evidence, while for reports of people’s
intentions or attitudes even hearsay evidence will often be enough.
Faller analyzes -chá as being simultaneously modal and evidential. The
asserted content is therefore ♦φ when φ is modified by -chá; the correspond-
ing sincerity condition also involves ♦φ instead of φ. A sincerity condition
indicating that the speaker’s reasoning has led him to believe that φ might
be possible is also introduced. The hearsay evidential -si is also complex.
As we saw, the propositional content p of the utterance is not asserted when
this evidential is used. Faller posits a special speech act, present, for this
situation, on which the speaker simply presents a proposition without making
claims about its truth. In addition, the sincerity condition requiring that the
speaker believe φ is eliminated, and a condition stating that the speaker
learned φ by hearsay is added.
65 This is only a very rough approximation of the normative conditions on assertion. See e.g.
Searle 1969 and Siebel 2003 for discussion.
While considering the degree to which the semantics of evidentials can
be viewed as homogeneous, McCready (2008a) attempted to provide a conventional
implicature-based analysis of the Quechua system. It seems plain
that the evidentials of this language behave in a way similar to conventional
implicatures: they are scopeless, do not participate in denial,66 and so on.
However, an adequate semantics cannot be provided in $L_{CI}$. To see this,
it suffices to consider -si: although si(φ) does not entail φ, taking si to
introduce a conventional implicature causes φ to be asserted, given an $L_{CI}$
analysis where si is an object of type $\langle t^a, t^c\rangle$. As we have already seen, the
combinatorics, together with (16), yield $\langle \phi, \{\mathit{si}(\phi)\}\rangle$ in this situation; this
means that φ is asserted, so the analysis fails.
However, with the extension of $L_{CI}$ to $L^{+}_{CI}$, we have more options available.
In fact, when one examines the conditions in (53), it can be seen that
they correspond to the three kinds of content we have discussed. The di-
rect evidential appears to provide the ‘ordinary’ supplementary content of
Pottsian conventional implicatures; the hearsay evidential, given that it makes
no claims about the truth of the content it applies to, acts to provide the
conventionally implicated main content of its utterance, and the inferential
evidential, given that it has effects in both the at-issue and CI dimensions, is
of mixed type. With this observation, an analysis becomes available. Here I
do not delve deeply into the content of the evidentials, instead making use
of predicates Bpg ‘there are best possible grounds for’, Hearsay ‘there is an
event of hearsay of’, and Inf, a relation between individuals and propositions
indicating that the first element has inferential evidence for the second.67
66 See Faller 2002 for details.
67 It is possible to spell at least some of this out in McCready & Ogata’s (2007) evidential logic.
This logic is dynamic and makes use of discourse referents for evidence sources, sorted
according to the type of evidence they provide (hearsay, visual, etc.). Quinean occasion
sentences are associated with a predicate E and are associated with an agent a, the evidence
holder, and a source i, the source of the content. McCready (2008a) gives a first attempt
at using this logic for the Quechua system. The idea is that Hearsay(p) can be defined by
making use of a test over $E^{i}_{a}p$-events where Sort(i)=hearsay and Inf(s,p) can be defined via a
test over $E^{i}_{a}p$-events where Sort(i)=judgemental. It is a bit harder to define Bpg(s, p), because
its satisfaction conditions are dependent on the content of p itself; but it should be possible.
I will not go further into this issue here.
(54) a. $\mathit{mi} = \lambda p.\,\mathrm{Bpg}(p) : \langle t^a, t^c\rangle$
b. $\mathit{si} = \lambda p.\,\mathrm{Hearsay}(p) : \langle t^a, t^s\rangle$
c. $\mathit{cha} = \lambda p.\,\Diamond p \;\blacklozenge\; \lambda p.\,\mathrm{Inf}(s, p) : \langle t^a, t^a\rangle \times \langle t^a, t^s\rangle$
Applied to a proposition φ, these lexical entries will, respectively, yield
the following:
(55) a. $\langle \phi, \{\mathrm{Bpg}(\phi)\}\rangle$
b. $\langle T, \{\mathrm{Hearsay}(\phi)\}\rangle$
c. $\langle \Diamond\phi, \{\mathrm{Inf}(s, \phi)\}\rangle$
These are precisely the desired results. This sketch of an analysis for the
Quechua evidential case thus provides an example of a situation in which the
full power of $L^{+}_{CI}$ is needed to analyze a single linguistic phenomenon. Of
course, the question of whether this analysis or Faller’s speech act-based one
is to be preferred for this case is separate, and depends on working out the
details of the conventional implicature story in connection with looking at
a wider array of more complex data. Still, at minimum, the discussion here
shows that a speech act analysis is not the only possibility for the phenomena
in question.
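The step from the lexical entries in (54) to the results in (55) can be sketched informally in Python. This is a toy illustration under stated assumptions rather than an implementation of the logic: propositions are plain strings, the helper names (`ci_entry`, `shunting_entry`, `mixed_entry`, `commits_speaker_to`) are hypothetical, and the interpretation is simply modeled as a pair of at-issue content and a set of conventionally implicated propositions.

```python
from typing import Callable, FrozenSet, NamedTuple

TRIVIAL = "T"        # the trivial at-issue value of (55b)
DIAMOND = "\u2666"   # possibility operator, written as in (53c) and (55c)


class Meaning(NamedTuple):
    at_issue: str
    ci: FrozenSet[str]


def ci_entry(pred: str) -> Callable[[str], Meaning]:
    """Type <t^a, t^c>: the argument survives in the at-issue dimension."""
    return lambda phi: Meaning(phi, frozenset({f"{pred}({phi})"}))


def shunting_entry(pred: str) -> Callable[[str], Meaning]:
    """Type <t^a, t^s>: the argument is consumed; nothing is asserted."""
    return lambda phi: Meaning(TRIVIAL, frozenset({f"{pred}({phi})"}))


def mixed_entry(at_issue_part: Callable[[str], str],
                ci_part: Callable[[str], str]) -> Callable[[str], Meaning]:
    """Mixed type <t^a, t^a> x <t^a, t^s>: one contribution per dimension."""
    return lambda phi: Meaning(at_issue_part(phi), frozenset({ci_part(phi)}))


mi = ci_entry("Bpg")
si = shunting_entry("Hearsay")
cha = mixed_entry(lambda p: f"{DIAMOND}{p}", lambda p: f"Inf(s, {p})")


def commits_speaker_to(m: Meaning, phi: str) -> bool:
    """The speaker is committed to phi only if phi itself is asserted."""
    return m.at_issue == phi
```

Applying `mi`, `si` and `cha` to a string such as `"rain"` gives pairs shaped like (55a-c). In particular, `commits_speaker_to(si("rain"), "rain")` is false, matching the observation from (52) that the hearsay evidential does not commit the speaker to the embedded proposition.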
5 Conclusion
This paper has made two major contributions. It has distinguished and provided
a logical system for the analysis of three distinct types of conventional
implicature: supplementary CIEs as modeled in Potts 2005, CIEs that provide
main content, analyzed in $L^{+S}_{CI}$ as being of shunting type, and mixed CIEs,
analyzed in $L^{+}_{CI}$. This typology is novel and is one that I think helps significantly
in understanding CIE phenomena. I doubt it is exhaustive, however. It seems
possible that the three categories analyzed need further subdivision, even
in terms of their typing (there is obvious need for subdivision in terms of
content). I believe that these systems will be useful for researchers working
to understand the range of conventional implicature in the world’s languages;
I hope the above discussion has provided some support for this belief. In
the process, the paper has analyzed a number of phenomena involving CIE
content, mostly of mixed or shunting type: these analyses are the second
contribution of the paper.
One question that has not been addressed in any detail is the nature of
the distinction between conventional implicature and expressive content,
or even if there is any empirical distinction. I think that, in terms of their
combinatorics, there might well not be any difference. The two show a similar
lack of interaction with most kinds of semantic operators (embedding under
attitudes being a significant exception), which suggests that they act similarly
in terms of compositional semantics. At the present moment, there has
not been sufficient empirical investigation for this point to be really clear.
My suspicion is that the difference between expressive and CI content lies in the
type of meanings that are carried rather than how those meanings behave in
composition, and so that the distinction is one that cross-cuts the distinctions
embodied in $L^{+}_{CI}$.
Another issue that arose several times in this paper is the nature of the
divide between presupposition and conventional implicature. I suggested that
(in part at least) it comes down to a difference in function. Presuppositions
aim to ‘match’ old information with new; conventional implicatures instead
work to introduce new information, but information that is not ‘open to
question’ in the way that asserted content is, instead serving to indicate the
speaker’s attitudes and commitments. This distinction is useful in cases
where the standard tests break down due to the complexity of a given piece
of content or the lack of a way to express it in a given (formal or natural)
language, as we saw. The particular examples provided here also raise
questions about the degree of translatability one can find for non-at-issue
domains in natural languages. It seems likely to me that Katz (1978) was
right in his thesis that any piece of content in a natural language L can be
translated into any other language L′, if one restricts attention to at-issue
content. Whether this thesis holds for presupposition or for conventional
implicature strikes me as more problematic (and not me alone: see Keenan
1974 and von Fintel & Matthewson 2008). The data in this paper suggests that
in certain complex cases, translation of these kinds of non-truth-conditional
content might be difficult or impossible, if there is no term in the target
language with the same semantics. For example, it is not at all obvious how
one might translate a sentence containing honorifics, or (certain) evidentials,
or particles of the kind discussed in this paper, into a language without
similar constructions, in a way that preserves meaning.68 It is my hope that
the work described in the present paper will contribute to solving questions
like these, and, in general, to the theory of natural language meaning.
68 This task is difficult even in the most basic sense of content-level equivalence. If one specifies
a translation that also preserves pragmatic and discourse-level behavior, it is even harder.
A Formal System of Potts (2005)
Here is the type system of $L_{CI}$.
i. The type system itself is as follows.
a. $e^a$, $t^a$, $s^a$ are basic at-issue types for $L_{CI}$.
b. $e^c$, $t^c$, $s^c$ are basic CI types for $L_{CI}$.
c. If σ and τ are at-issue types for $L_{CI}$, then $\langle\sigma, \tau\rangle$ is an at-issue
type for $L_{CI}$.
d. If σ is an at-issue type for $L_{CI}$ and τ is a CI type for $L_{CI}$, then
$\langle\sigma, \tau\rangle$ is a CI type for $L_{CI}$.
e. If σ and τ are at-issue types for $L_{CI}$, then $\langle\sigma \times \tau\rangle$ is a product
type for $L_{CI}$.
f. The full set of types for $L_{CI}$ is the union of the at-issue types and
CI types for $L_{CI}$.
ii. Further, let x serve as a variable over {e, t, s} and let σ and τ serve as
variables over well-formed types with their superscripts stripped off.
The type-superscript abbreviator is defined as follows:
$x^a \leadsto x^a$
$x^c \leadsto x^c$
$\langle\sigma^a, \tau^a\rangle \leadsto \langle\sigma, \tau\rangle^a$
$\langle\sigma^a, \tau^c\rangle \leadsto \langle\sigma, \tau\rangle^c$
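Clauses (a) through (e) amount to a recursive well-formedness check, which can be sketched in Python as follows. The encoding is an assumption made purely for illustration: a basic type is a two-character string such as `"ta"` (for $t^a$) or `"ec"` (for $e^c$), a functional type $\langle\sigma, \tau\rangle$ is the tuple `("fn", sigma, tau)`, and a product type is `("x", sigma, tau)`. The function name `kind` is likewise hypothetical.

```python
BASIC = {"e", "t", "s"}
DIMS = {"a": "at-issue", "c": "CI"}


def kind(ty):
    """Classify a type as 'at-issue', 'CI' or 'product'; None if ill-formed."""
    if isinstance(ty, str):
        # clauses (a) and (b): basic at-issue and CI types
        ok = len(ty) == 2 and ty[0] in BASIC and ty[1] in DIMS
        return DIMS[ty[1]] if ok else None
    if isinstance(ty, tuple) and len(ty) == 3:
        tag, left, right = ty
        kl, kr = kind(left), kind(right)
        # clauses (c) and (d): the argument must be at-issue and the
        # result type decides whether the whole type is at-issue or CI
        if tag == "fn" and kl == "at-issue" and kr in {"at-issue", "CI"}:
            return kr
        # clause (e): products of two at-issue types
        if tag == "x" and kl == kr == "at-issue":
            return "product"
    return None
```

For example, `kind(("fn", "ta", "tc"))` classifies $\langle t^a, t^c\rangle$ as a CI type, whereas `kind(("fn", "tc", "ta"))` returns `None`: no clause builds a type whose argument is a CI type, so CI content cannot serve as input to further composition.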
B Modified Type System: $L^{+}_{CI}$
I define two type systems here. The first, $L^{+S}_{CI}$, introduces shunting types. The
second, $L^{+}_{CI}$, builds on $L^{+S}_{CI}$ to allow for the use of mixed content terms as
well. The reason for defining the two systems independently is that the full
power of the extended system will not be needed for all applications, and it
may be convenient for users of the types proposed here to have a subsystem
at hand that fits their needs.
B.1 Shunting types: $L^{+S}_{CI}$
Here is the type system of $L^{+S}_{CI}$, which is just that of $L_{CI}$ supplemented with
additional shunting types. I follow Potts in my definition, which means that
many shunting types are produced that do not get used (just as with the CI
types of $L_{CI}$).
• The type system itself is identical to that of $L_{CI}$ except that:
i. The following clauses are added to the $L_{CI}$ type specification:
(g) $e^s$, $t^s$, $s^s$ are basic shunting types for $L^{+S}_{CI}$.
(h) If σ is an at-issue type for $L^{+S}_{CI}$ and τ is a shunting type for
$L^{+S}_{CI}$, then $\langle\sigma, \tau\rangle$ is a shunting type for $L^{+S}_{CI}$.
(i) If σ is a shunting type for $L^{+S}_{CI}$ and τ is a shunting type for
$L^{+S}_{CI}$, then $\langle\sigma, \tau\rangle$ is a shunting type for $L^{+S}_{CI}$.
ii. Clause (f) of the $L_{CI}$ type specification is replaced with
f’. The full set of types for $L^{+S}_{CI}$ is the union of the at-issue types,
the CI types and the shunting types for $L^{+S}_{CI}$.
iii. All instances of ‘$L_{CI}$’ in the $L_{CI}$ type specification are replaced
with ‘$L^{+S}_{CI}$’.
iv. The following two clauses are added to the definition of the
type-superscript abbreviator:
$x^s \leadsto x^s$
$\langle\sigma^a, \tau^s\rangle \leadsto \langle\sigma, \tau\rangle^s$
• This type definition, bundled with the $L_{CI}$ rules (R1-6), the newly
defined rule (R7), and the revised interpretation mechanism in (32),
comprises $L^{+S}_{CI}$.
B.2 The full system: $L^{+}_{CI}$
The full system adds some rules to $L^{+S}_{CI}$.
• The type system is identical to that of $L^{+S}_{CI}$ except that:
i. The following clauses are added to the $L^{+S}_{CI}$ type specification.69
(i) If σ and τ are at-issue types for $L^{+}_{CI}$, and ζ and υ are shunting
types for $L^{+}_{CI}$, then $\sigma \times \zeta$, $\langle\sigma, \tau\rangle \times \zeta$,
$\sigma \times \langle\tau, \zeta\rangle$ and $\sigma \times \langle\zeta, \upsilon\rangle$
are mixed types for $L^{+}_{CI}$.
(ii) If σ, τ and ζ are at-issue types for $L^{+}_{CI}$ and υ is a shunting
type for $L^{+}_{CI}$, then $\langle\sigma, \tau\rangle \times \langle\zeta, \upsilon\rangle$ is a mixed type for $L^{+}_{CI}$.
ii. All instances of ‘$L^{+S}_{CI}$’ in the $L^{+S}_{CI}$ type specification are replaced
with ‘$L^{+}_{CI}$’.
• This type definition, together with the $L_{CI}$ rules (R1-7) and the new
rules (R8,9) and the interpretation rule (32), comprises $L^{+}_{CI}$.
69 Comment: It is not necessary to use most of the types produced by clause (i) for the analyses
made in the present paper. However, I will make such types available in the logic: I do not
think it wise to restrict the type system too much in view of our limited current knowledge of
the range of mixed type expressions in natural language. Here I in effect follow the practice
of $L_{CI}$, where a wide range of CI types is made available, although in practice only a narrow
range of them ends up being used.
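The extended type specification can be sketched as a recursive classifier in Python. The encoding is an assumption made for illustration only: basic types are two-character strings (`"ta"` for $t^a$, `"ts"` for $t^s$), functional types are tuples `("fn", sigma, tau)`, and products are `("x", sigma, tau)`; the name `kind` is hypothetical. The sketch treats every product of an at-issue type with a shunting type as mixed. This covers each form listed in B.2, since $\langle\sigma, \tau\rangle$ is at-issue by clause (c) and the functional right-hand components are shunting by clauses (h) and (i) of B.1.

```python
BASIC = {"e", "t", "s"}
DIMS = {"a": "at-issue", "c": "CI", "s": "shunting"}


def kind(ty):
    """Classify a type; returns None when no clause licenses it."""
    if isinstance(ty, str):
        # clauses (a), (b), (g): basic at-issue, CI and shunting types
        ok = len(ty) == 2 and ty[0] in BASIC and ty[1] in DIMS
        return DIMS[ty[1]] if ok else None
    if isinstance(ty, tuple) and len(ty) == 3:
        tag, left, right = ty
        kl, kr = kind(left), kind(right)
        if tag == "fn":
            if kl == "at-issue" and kr in {"at-issue", "CI", "shunting"}:
                return kr             # clauses (c), (d), (h)
            if kl == "shunting" and kr == "shunting":
                return "shunting"     # clause (i) of B.1
            return None
        if tag == "x":
            if kl == "at-issue" and kr == "at-issue":
                return "product"      # clause (e)
            if kl == "at-issue" and kr == "shunting":
                return "mixed"        # clauses (i) and (ii) of B.2
            return None
    return None


# The type assigned to the inferential evidential in (54c).
CHA_TYPE = ("x", ("fn", "ta", "ta"), ("fn", "ta", "ts"))
```

Here `kind(CHA_TYPE)` returns `"mixed"` and `kind(("fn", "ta", "ts"))` returns `"shunting"`, while `kind(("fn", "ts", "ta"))` returns `None`, reflecting the fact that no clause maps shunting-typed input to at-issue output.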
References
Amaral, Patricia, Craige Roberts & E. Allyn Smith. 2008. Review of ‘The
logic of conventional implicatures’ by Christopher Potts. Linguistics and
Philosophy 30(6). 707–749. doi:10.1007/s10988-008-9025-2.
Asher, Nicholas. 2000. Truth conditional discourse semantics for parentheti-
cals. Journal of Semantics 17(1). 31–50. doi:10.1093/jos/17.1.31.
Asher, Nicholas & Alex Lascarides. 2003. Logics of conversation. Cambridge:
Cambridge University Press.
Asher, Nicholas & James Pustejovsky. 2005. Word meaning and
commonsense metaphysics. Ms., University of Texas Austin and Brandeis University.
http://semanticsarchive.net/Archive/TgxMDNkM/asher-pustejovsky-wordmeaning.pdf.
Bach, Kent. 1999. The myth of conventional implicature. Linguistics and
Philosophy 22(4). 327–366. doi:10.1023/A:1005466020243.
Bach, Kent. 2006. Review of Christopher Potts, ‘The logic of con-
ventional implicatures’. Journal of Linguistics 42(2). 490–495.
doi:10.1017/S0022226706304094.
Barker, Chris & Pauline Jacobson. 2007. Direct compositionality. Oxford:
Cappelen, Herman & Ernest Lepore. 2005. Insensitive semantics. Oxford:
Blackwell.
Carpenter, Bob. 1998. Type-logical semantics. Cambridge, MA: MIT Press.
Chierchia, Gennaro. 1998. Reference to kinds across language. Natural
Language Semantics 6(4). 339–405. doi:10.1023/A:1008324218506.
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi. 2007. Dynamic
epistemic logic. Berlin: Springer.
Dummett, Michael. 1973. Frege: Philosophy of language. London: Duckworth.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco
Quechua. Stanford, CA: Stanford University dissertation.
Faller, Martina. 2003. Propositional- and illocutionary-level evidentiality in
Cuzco Quechua. In Jan Anderssen, Paula Menendez-Benito & Adam Werle
(eds.), The proceedings of the second conference on the semantics of
under-represented languages in the Americas [SULA 2], 19–34. Amherst: GLSA.
Faller, Martina. 2006. Evidentiality above and below speech acts. Unpublished ms.
http://personalpages.manchester.ac.uk/staff/Martina.T.Faller/documents/Evidentiality.Above.Below.pdf.
Faller, Martina. 2007. The Cuzco Quechua reportative evidential and rhetorical
relations. In Andrew Simpson & Peter Austin (eds.), Endangered languages
(Linguistische Berichte Sonderheft 14), 223–252. Hamburg: Helmut Buske
Verlag.
von Fintel, Kai & Lisa Matthewson. 2008. Universals in semantics. The
Linguistic Review 25(1-2). 139–201. doi:10.1515/TLIR.2008.004.
Fodor, Jerry. 2002. Concepts. Oxford: Oxford University Press.
Geurts, Bart. 2007. Really fucking brilliant. Theoretical Linguistics 33(2).
209–214. doi:10.1515/TL.2007.013.
Grice, H. Paul. 1975. Logic and conversation. In Peter Cole & Jerry Morgan
(eds.), Syntax and semantics III: Speech acts, 41–58. New York: Academic
Press.
Hara, Yurie. 2008. Non-propositional modal meaning. Manuscript, Kyoto University.
http://www.semanticsarchive.net/Archive/WUxZjFiM/darou_hara.pdf.
Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar
(Blackwell Textbooks in Linguistics 13). Oxford, England: Blackwell.
Hom, Christopher. 2008. The semantics of racial epithets. The Journal of
Philosophy 105(8). 416–440.
Horn, Laurence. 2007. Toward a Fregean pragmatics: Voraussetzung,
Nebengedanke, Andeutung. In Istvan Kecskes & Laurence Horn (eds.),
Explorations in pragmatics, 39–69. Berlin: Mouton de Gruyter.
Jeffrey, Richard. 1983. The logic of decision. Chicago: University of Chicago
Press.
Kaplan, David. 1989. Demonstratives. In Joseph Almog, John Perry & Howard
Wettstein (eds.), Themes from Kaplan, 481–566. Oxford University Press.
Manuscript version from 1977.
Kaplan, David. 1999. The meaning of ouch and oops: Explorations in the
theory of meaning as use. Manuscript, UCLA.
Katz, Jerrold. 1978. Effability and translation. In Franz Guenthner & Monica
Guenthner-Reutter (eds.), Meaning and translation: Philosophical and
linguistic approaches, 191–234. London: Duckworth.
Kaufmann, Stefan. 2009. On the projection of expressive presuppositions.
Paper presented at Workshop on Non-truth-conditional Meaning, DGfS,
Osnabrück.
Keenan, Edward L. 1974. Logic and language. In Morton Bloomfield & Einar
Haugen (eds.), Language as a human problem, 187–196. New York: W.W.
Norton and Company.
Kim, Jong-Bok & Peter Sells. 2007. Korean honorification: A kind of ex-
pressive meaning. Journal of East Asian Linguistics 16(4). 303–336.
doi:10.1007/s10831-007-9014-4.
Kooi, Barteld Pieter. 2003. Probabilistic dynamic epistemic logic.
Journal of Logic, Language and Information 12(4). 381–408.
doi:10.1023/A:1025050800836.
Kratzer, Angelika. 1999. Beyond ouch and oops: How descriptive and expres-
sive meaning interact. Handout of a talk given at Cornell Conference on
Theories of Context Dependency. http://semanticsarchive.net/Archive/
WEwNGUyO/.
Kratzer, Angelika. 2009. Making a pronoun. Linguistic Inquiry 40(2). 187–237.
doi:10.1162/ling.2009.40.2.187.
Kubota, Yusuke & Wataru Uegaki. 2009. Continuation-based semantics for
conventional implicatures and the Japanese benefactive. Poster presented
at SALT 19. http://www.ling.ohio-state.edu/~kubota/papers/ci-salt.pdf.
Lasersohn, Peter. 2005. Context dependence, disagreement, and pred-
icates of personal taste. Linguistics and Philosophy 28(6). 643–686.
doi:10.1007/s10988-005-0596-x.
van Leusen, Noor. 2004. Incompatibility in context: a diagnosis of correction.
Levinson, Stephen. 1983. Pragmatics. Cambridge: Cambridge University Press.
McCready, Eric. 2004. Two Japanese adverbials and expressive content.
In Kazuha Watanabe & Robert B. Young (eds.), Proceedings of SALT 14,
163–178. http://semanticsarchive.net/Archive/2Y3YjAxM/.
McCready, Eric. 2008a. Semantic heterogeneity in evidentials. In Ken Satoh,
Akihiro Inokuchi, Katashi Nagao & Takahiro Kawamura (eds.), New fron-
tiers in artificial intelligence: JSAI 2007 conference and workshops revised
selected papers (Lecture Notes in Computer Science 4914), 81–94. Berlin:
Springer. doi:10.1007/978-3-540-78197-4_10.
McCready, Eric. 2008b. What man does. Linguistics and Philosophy 31(6).
671–724. doi:10.1007/s10988-009-9052-7.
McCready, Eric & Norry Ogata. 2007. Evidentiality, modality, and probability.
Linguistics and Philosophy 30(2). 147–206. doi:10.1007/s10988-007-9017-7.
McCready, Eric & Magdalena Schwager. 2009. Intensifiers. Paper presented at
Workshop on Non-truth-conditional Meaning, DGfS, Osnabrück.
Neale, Stephen. 1999. Coloring and composition. In Robert Stainton (ed.),
Philosophy and linguistics, 35–82. Boulder, CO: Westview Press.
Potts, Christopher. 2005. The logic of conventional implicatures. Oxford
University Press. Revised version of 2003 UCSC dissertation.
Potts, Christopher. 2007a. The expressive dimension. Theoretical Linguistics
33(2). 165–198. doi:10.1515/TL.2007.011.
Potts, Christopher. 2007b. The centrality of expressive indices: Re-
ply to the commentaries. Theoretical Linguistics 33(2). 255–268.
doi:10.1515/TL.2007.019.
Potts, Christopher, Luis Alonso-Ovalle, Ash Asudeh, Rajesh Bhatt, Seth Cable,
Christopher Davis, Yurie Hara, Angelika Kratzer, Eric McCready, Tom
Roeper & Martin Walkow. 2009. Expressives and identity conditions.
Linguistic Inquiry 40(2). 356–366. doi:10.1162/ling.2009.40.2.356.
Potts, Christopher & Shigeto Kawahara. 2004. Japanese honorifics as emotive
definite descriptions. In Kazuha Watanabe & Robert B. Young (eds.),
Proceedings of SALT 14, 235–254. http://semanticsarchive.net/Archive/
WZhMmY3N/.
Potts, Christopher & Tom Roeper. 2006. The narrowing acquisition path: From
expressive small clauses to declaratives. In Ljiljana Progovac, Kate Paesani,
Eugenia Casielles & Ellen Barton (eds.), The syntax of nonsententials, 183–
201. John Benjamins.
Pustejovsky, James. 1995. The generative lexicon. Cambridge, MA: MIT Press.
Rett, Jessica. 2008. Degree modification in natural language. New Brunswick,
NJ: Rutgers dissertation.
Richard, Mark. 2008. When truth gives out. Oxford: Oxford University Press.
van der Sandt, Rob. 1992. Presupposition projection as anaphora resolution.
Schlenker, Philippe. 2007. Expressive presuppositions. Theoretical Linguistics
33(2). 237–245. doi:10.1515/TL.2007.017.
Schlenker, Philippe. 2008. Presupposition projection: Explanatory strategies.
Theoretical Linguistics 34(3). 287–316. doi:10.1515/THLI.2008.021.
Searle, John. 1969. Speech acts. Cambridge: Cambridge University Press.
Siebel, Mark. 2003. Illocutionary acts and attitude expression. Linguistics and
Philosophy 26(3). 351–366. doi:10.1023/A:1024110814662.
Sørensen, Morten Heine & Pawel Urzyczyn. 2006. Lectures on the Curry-Howard
isomorphism (Studies in Logic and the Foundations of Mathematics 149).
Amsterdam: Elsevier Science.
Stanley, Jason. 2000. Context and logical form. Linguistics and Philosophy
23(4). 391–434. doi:10.1023/A:1005599312747.
Stephenson, Tamina. 2007. Judge dependence, epistemic modals, and
predicates of personal taste. Linguistics and Philosophy 30(4). 487–525.
doi:10.1007/s10988-008-9023-4.
Vanderveken, Daniel. 1990. Meaning and speech acts. Cambridge: Cambridge
University Press.
Wang, Linton, Eric McCready & Brian Reese. 2006. Nominal appositives in
context. In Michael Temkin Martínez, Asier Alcázar & Roberto Mayoral
Hernández (eds.), Proceedings of the thirty-third Western Conference on
Linguistics [WECOL 33], 411–423.
Wang, Linton, Brian Reese & Eric McCready. 2005. The projection problem
of nominal appositives. Snippets 11. 13–14. http://www.ledonline.it/
snippets/allegati/snippets10005.pdf.
Williamson, Timothy. 2009. Reference, inference and the semantics of pejo-
ratives. In Joseph Almog & Paolo Leonardi (eds.), The philosophy of David
Kaplan, 137–159. Oxford: Oxford University Press.
Eric McCready
Department of English
Aoyama Gakuin University
4-4-25 Shibuya
Shibuya-ku, Tokyo 150-8366
mccready@cl.aoyama.ac.jp
doi: 10.3765/sp.3.5
Embedded Implicatures?
Remarks on the debate between globalist and localist theories
Michela Ippolito
University of Toronto
Received 2009-10-24 / Revised 2009-12-28 / Published 2010-03-24
Abstract Geurts & Pouscoulous (2009) present experimental evidence that
embedded implicatures are not systematically available and conclude that
localist theories of implicatures cannot be maintained. I argue that this
conclusion can be strengthened by showing that their findings cannot be
reconciled with a localist theory even when the latter is supplemented with a
formal way to predict when an embedded implicature will be preferred, as
suggested in Chierchia, Fox & Spector 2008.
Keywords: implicatures, scalar implicatures, embedded implicatures, conversational
implicature, local implicature, experimental pragmatics, neg-raising verbs
1 Introduction
Geurts & Pouscoulous (2009) present some interesting experimental data
pointing against a localist view of scalar implicatures according to which
scalar implicatures are systematically generated in embedded as well as in
non-embedded positions.
One case that typically is said to trigger an embedded implicature is the
case of a clause embedded under an attitude verb such as think or believe, as
in (1).
(1) John thinks that Fred heard some of Verdi’s operas.
The implicature that (1) generates — the localist maintains — is that John
thinks that Fred heard some but not all of Verdi’s operas. Assuming a localist
view according to which implicatures are triggered by means of a silent
exhaustive operator O as in Chierchia et al. 2008, the embedded implicature
©2010 Michela Ippolito

in (1) is triggered when O adjoins the embedded clause as shown in (2).1 This
gives rise to the meaning in (3).
(2) John thinks that O(Fred heard some of Verdi’s operas)
(3) John thinks that Fred heard some of Verdi’s operas and John thinks
that Fred didn’t hear all of Verdi’s operas
According to the globalist view, embedded implicatures such as (3) cannot be
generated and (1) can only conversationally implicate that John doesn’t think
that Fred heard all of Verdi’s operas. Suppose that John is knowledgeable
about whether Fred heard all of Verdi’s operas or not. Then, from the weak
implicature that John doesn’t believe that Fred heard all of Verdi’s operas
we can infer that John believes Fred didn’t hear all of Verdi’s operas. This
is what Sauerland (2004) calls the “epistemic step”. The result looks like
the embedded implicature that the localist generates by embedding the
exhaustive operator O, but in fact is just a global implicature strengthened
by means of some assumptions about John’s epistemic state.2
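The epistemic step can be checked mechanically in a toy model. The following is a minimal sketch and not part of any of the cited proposals: it assumes only three kinds of worlds (Fred heard none, some but not all, or all of the operas), represents John’s belief state as a non-empty set of worlds, and treats “John thinks p” as truth of p in every world of that set. All names in the code are hypothetical.

```python
from itertools import combinations

# Toy worlds: how many of Verdi's operas Fred heard (an assumption of this sketch).
WORLDS = ("none", "some_not_all", "all")
ALL = {"all"}
NOT_ALL = {"none", "some_not_all"}


def belief_states():
    """Yield every non-empty set of worlds as a candidate belief state for John."""
    for size in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def thinks(state, prop):
    """John thinks prop iff prop holds in all of his doxastic alternatives."""
    return state <= prop


def weak_implicature(state):
    """Global implicature: John does not think Fred heard all of the operas."""
    return not thinks(state, ALL)


def opinionated(state):
    """Competence assumption: John has settled whether Fred heard all of them."""
    return thinks(state, ALL) or thinks(state, NOT_ALL)


def strong_inference(state):
    """Strengthened result: John thinks Fred did not hear all of the operas."""
    return thinks(state, NOT_ALL)


# With the competence assumption, the weak implicature always strengthens.
step_is_valid = all(
    strong_inference(s)
    for s in belief_states()
    if weak_implicature(s) and opinionated(s)
)

# Without it, some belief states verify the weak implicature only.
counterexamples = [
    s for s in belief_states()
    if weak_implicature(s) and not strong_inference(s)
]
```

In this model the strengthened inference holds in every belief state that satisfies both the weak implicature and the competence assumption, while the counterexamples are exactly the states in which John is undecided about whether Fred heard all of the operas.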
Geurts and Pouscoulous conclude that, on the basis of their findings, the
localist position is untenable. Here I will discuss whether their findings can in
principle be reconciled with a localist theory once the latter is supplemented
with a mechanism for predicting when a given reading will be preferred
or dispreferred. As a paradigmatic case, I will consider the localist theory
defended by Chierchia et al. (2008). My conclusion will support Geurts and
Pouscoulous’s: even when supplemented with a mechanism for determining
when an embedded implicature will be preferred, the localist predictions are
incompatible with Geurts and Pouscoulous’s findings.
Furthermore, I will consider the class of Neg-raising (NR) verbs (Horn
1978) and argue that a localist theory makes predictions which again are
incompatible with Geurts and Pouscoulous’s experimental findings. I will then
sketch a way in which a globalist theory might in principle be able to explain
the experimental differences we find among the NR predicates included in
Geurts and Pouscoulous’s questionnaires. However, further experimental
research is needed in order to ascertain whether the globalist line of argument
suggested here works when extended to other NR predicates.
1 That local implicatures exist has been advocated by several people, even though the idea has
been implemented differently in different proposals. See for example Bach 1994, Carston
1988, Chierchia 2004, Fox 2006, Levinson 1983, Levinson 2000, Recanati 2003, among others.
2 Following Grice (1975), advocates of a globalist theory of implicatures include Gazdar (1979),
Geurts (2009), Horn (1972, 1989), King & Stanley (2006), among many others.
2 Experiments and results
To test whether the predictions made by the local view of implicatures are
correct, Geurts and Pouscoulous looked at different types of embeddings. In
their first experiment, they considered complex sentences where the scalar
item some is embedded in the nuclear scope of the universal quantifier all;
under a modal verb with a universal force; in the complement of think; and
finally in the complement of want. They compared the results they obtained
in these cases with the rate of implicatures drawn in unembedded clauses and
found that, while scalar implicatures were accepted in the majority of simple
(unembedded) cases, the acceptance rate was much lower in the complex
conditions (with differences among conditions; see section 3.1 below). The ex-
periment used an inference task in which participants were shown a sentence
containing a scalar expression (e.g. some) and were asked whether they would
infer that the corresponding sentence with the stronger scalar expression
(e.g. all) was false. In a subsequent experiment, the authors compared the
rate of local implicatures found using the inference task with the rate of
local implicatures found using a verification task in which participants were
shown a sentence containing a scalar expression and were asked to decide
whether that sentence correctly described a picture that they were shown.
The result of the latter experiment when applied to unembedded clauses
showed that the inference paradigm yields higher rates of scalar implicatures
than the verification paradigm, and therefore that the verification task is a
more reliable way to find out the rate at which people actually draw scalar
implicatures. When applied to the question of whether local implicatures are
drawn in embedded clauses, the verification task performed by Geurts and
Pouscoulous “completely failed to yield the local SIs predicted by mainstream
conventionalism” (Geurts & Pouscoulous 2009). In particular, the authors
tested scalar items (here, some) embedded in downward-entailing (DE) contexts
(i.e. Not all the squares are connected with some of the circles); scalar
items embedded in upward-entailing (UE) contexts (i.e. All the squares are
connected with some of the circles); and finally, scalar items embedded in
non-monotonic (NM) contexts (i.e. There are exactly two squares that are
connected with some of the circles).
In the next section, taking Chierchia et al. 2008 to be a paradigmatic
example of a localist theory, I will spell out in more detail how this theory
works and I will consider the consequences of Geurts and Pouscoulous’s
experimental results, particularly with respect to the issue of the frequency
with which embedded implicatures are drawn.
3 Discussion
Geurts and Pouscoulous’s experimental results are not per se a knockdown
argument against embedded implicatures. The localist might object that
Geurts and Pouscoulous’s experimental results do not show that embedded
(or local) implicatures are impossible but only that they are not generally
available, and this is at least consistent with one possible localist view: that
is, that since they seem to be triggered in special circumstances, embedded
implicatures must be possible, even though they are not generally available.
Chierchia et al. (2008) have recently discussed some of the circumstances
where embedded implicatures are triggered. The examples in (4) through (6)
illustrate some of these circumstances.
(4) If you take salad or dessert, you pay $20; but if you take both there is
a surcharge.
(5) Exactly two students wrote a paper or ran an experiment. The others
either did both or made a class presentation.
(6) Mary solved some or all of the problems.
Take (4). Chierchia et al. (2008) argue that, while implicatures are not nor-
mally triggered in the antecedent of conditionals (a DE environment), the
continuation in (4) forces an exclusive interpretation of or in the antecedent
(that is an interpretation of the antecedent strengthened with the scalar
implicature “but not both”) as the only way to guarantee a coherent interpretation
for the discourse. Embedding the exhaustive operator in the antecedent
guarantees that such an interpretation is generated.3
Someone might initially object to Chierchia et al.’s (2008) argument that,
if embedded and non-embedded implicatures are generated by the same
mechanism (in this case the exhaustive operator O), then you would not expect
local implicatures to be confined to this very special set of cases. The fact that
local implicatures seem to be confined to a very narrow set of cases, and that
occurrences of scalar items (such as some or or) in embedded positions do
not normally trigger local implicatures raises the suspicion that the “effect”
3 Similarly for (5) and (6). In (5), the continuation is argued to force the embedded implicature
giving rise to the interpretation according to which ‘exactly two students wrote a paper or
ran an experiment but didn’t do both’. The continuation in (6) is also argued to force the
embedded implicature so that as a result the interpretation of the sentence will be that
either Mary solved some but not all of the problems or she solved all of them.
of local implicatures is actually due to a different mechanism, and that these
are not implicatures after all.4
To address the issue of the frequency of embedded implicatures (why
embedded implicatures are much less frequent than global implicatures),
Chierchia et al. (2008) have suggested that there is a preference for the
strongest possible interpretation among the possible readings of a sentence,
and that this might account for why having the exhaustive operator O in
the scope of a DE operator is a dispreferred option since it gives rise to an
interpretation weaker than the one obtained without O. The authors consider
two versions of the “strongest meaning hypothesis” (SMH), as shown in (7)
and (8), both from Chierchia et al. 2008.
(7) SMH1:
Let ϕ be a certain logical form. Let ϕ’s competitors be all the LFs that
differ from ϕ only with respect to where the exhaustivity operator
occurs. Then, everything else being equal, ϕ is dispreferred if one of
its competitors is stronger than ϕ.
(8) SMH2:
Let S be a sentence of the form [S . . . O(X) . . . ]. Let S′ be the sentence
of the form [S′ . . . X . . . ], i.e. the one that is derived from S by replacing
O(X) by X, i.e. by eliminating this particular occurrence of O. Then,
everything else being equal, S′ is preferred to S if S′ is logically
stronger than S.
According to SMH1, given a certain logical form, all LFs differing in where the
exhaustivity operator occurs will compete with each other and the strongest
LF will be preferred. According to SMH2, alternative LFs differing in the
placement of the exhaustive operator do not compete with each other but
only with the LF without the operator. Taking Chierchia et al.’s (2008) theory
as the paradigmatic localist theory, the question that arises is whether the
localist theory sketched above supplemented with either SMH1 or SMH2 can
be reconciled with Geurts and Pouscoulous’s experimental results.
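The difference between the two hypotheses can be made concrete with a small sketch. The example is my own illustration, not one from Chierchia et al. (2008): it assumes a DE environment of the form not (A or B), identifies the meaning of an LF with the set of worlds in which it is true, and treats logical strength as proper inclusion of such sets. The function names and the hand-computed value for the global LF are assumptions of the sketch.

```python
# Worlds record which of A and B hold; the four-world model is purely illustrative.
WORLDS = [frozenset(), frozenset("A"), frozenset("B"), frozenset("AB")]


def denotation(pred):
    """The set of worlds in which an LF is true."""
    return frozenset(w for w in WORLDS if pred(w))


def inclusive_or(w):
    """A or B."""
    return len(w) >= 1


def exclusive_or(w):
    """O(A or B): A or B, but not both."""
    return len(w) == 1


LFS = {
    "no_O": denotation(lambda w: not inclusive_or(w)),     # not (A or B)
    "local_O": denotation(lambda w: not exclusive_or(w)),  # not O(A or B)
    # O(not (A or B)) is assumed to be vacuous here: the only alternative,
    # not (A and B), is already entailed by not (A or B).
    "global_O": denotation(lambda w: not inclusive_or(w)),
}
WITH_O = ["local_O", "global_O"]


def stronger(p, q):
    """p is logically stronger than q iff p is true in a proper subset of q's worlds."""
    return p < q


def smh1_dispreferred(lfs, with_o):
    """SMH1: dispreferred if a competitor differing only in where O occurs is stronger."""
    return {
        n for n in with_o
        if any(stronger(lfs[m], lfs[n]) for m in with_o if m != n)
    }


def smh2_dispreferred(lfs, with_o, base="no_O"):
    """SMH2: dispreferred if deleting this occurrence of O yields a stronger LF."""
    return {n for n in with_o if stronger(lfs[base], lfs[n])}


smh1_result = smh1_dispreferred(LFS, WITH_O)
smh2_result = smh2_dispreferred(LFS, WITH_O)
```

In this DE case both versions disprefer the LF with embedded O, but for different reasons: under SMH1 it loses to the competing LF with O at the root, whereas under SMH2 it loses to the LF without O.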
4 This might explain why, while focal stress is often needed to bring out the embedded
implicature interpretation, focal stress is not needed to bring out the non-local implicature
interpretation. Chierchia et al. (2008) attribute the fact that focal stress helps the embedded
implicature reading of the sentences they consider to the nature of the mechanism they
appeal to, i.e. covert exhaustification, which is triggered by focus. However covert exhausti-
fication is also supposed to be responsible for the non-local implicature raising the question
why focal stress is a relevant factor in the explanation of one type of implicature but not in
the explanation of the other.
Consider the predictions made by either version of SMH for sentences
where the scalar item is embedded under the epistemic predicate be certain.
(9) John is certain that Fred heard some of Verdi’s operas.
a. John is certain that O(Fred heard some of Verdi’s operas)
b. O(John is certain that Fred heard some of Verdi’s operas)
The configuration of the operator O in (9a) triggers the local implicature
that John is certain that Fred did not hear all of Verdi’s operas. (9b), on the
other hand, triggers the implicature that John is not certain that Fred heard
all of Verdi’s operas. Consider SMH1 first. Let us assume that α is certain
that ϕ means that α has a justified belief that ϕ is true.5 The assertoric
content of (9) strengthened by the implicature in (9a) gives rise to a meaning
stronger than the meaning obtained by strengthening (9)’s assertion with
the implicature in (9b). If in all of John’s doxastic worlds it is true that Fred
heard some but not all of Verdi’s operas (and if John is justified in having
this belief), then it is not the case that in all of John’s doxastic worlds Fred
heard all of Verdi’s operas (and it is not the case that John is justified in
believing that Fred heard them all). This entailment is asymmetric. Therefore,
SMH1 predicts that the interpretation in (9a) should be the preferred one.
However, assuming that Geurts and Pouscoulous’s findings can be extended
to predicates such as be certain, they show that the embedded implicature in
(9a) is clearly not the preferred interpretation.
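The asymmetric entailment between the two strengthened readings can be verified by brute force in a toy model. The sketch below is only an illustration under simplifying assumptions: it models the belief component of be certain and ignores justification, it assumes three kinds of worlds, and it takes John’s doxastic state to be a non-empty set of worlds. The names are hypothetical.

```python
from itertools import combinations

# Toy worlds: how many of Verdi's operas Fred heard (an assumption of this sketch).
WORLDS = ("none", "some_not_all", "all")
SOME = {"some_not_all", "all"}
SOME_NOT_ALL = {"some_not_all"}
ALL = {"all"}


def belief_states():
    """Yield every non-empty set of worlds as a candidate doxastic state."""
    for size in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def certain(state, prop):
    """Belief component of 'be certain': prop holds in all doxastic worlds."""
    return state <= prop


def models(reading):
    """The set of doxastic states that make a reading true."""
    return frozenset(s for s in belief_states() if reading(s))


# (9) without O, (9a) with embedded O, and (9b) with O at the root.
plain = models(lambda s: certain(s, SOME))
local_9a = models(lambda s: certain(s, SOME_NOT_ALL))
global_9b = models(lambda s: certain(s, SOME) and not certain(s, ALL))

local_entails_global = local_9a <= global_9b
global_entails_local = global_9b <= local_9a
both_stronger_than_plain = local_9a < plain and global_9b < plain
```

In this model (9a) entails (9b) but not vice versa, which is the configuration in which SMH1 selects (9a). Both readings are also strictly stronger than the reading without O.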
Suppose we assume SMH2 instead of SMH1. Because both LFs with the
exhaustive operator convey interpretations stronger than the one conveyed
by the LF without the operator, the proposal predicts that both (9a) and
(9b) should be equally available. But we have already seen that Geurts and
Pouscoulous’s results show that this is not the case: (9a) is dispreferred.
Appealing to independent considerations, such as the implausibility of the
reading in (9a), in order to explain why it is rare is a dubious move. In Geurts
and Pouscoulous’s experiments, the context plays no role. Therefore, we
expect that the most salient reading (the reading preferred by the participants
in the experiment) will be the one selected by the SMH, but we saw that this
is not the case.
5 I am not claiming that this is all there is to say about what be certain means. All I am
assuming here is that saying that α is certain that ϕ entails that α believes ϕ and has some
justification for believing ϕ.
Similar considerations apply to modal verbs like wish:
(10) John wishes that Fred would try some of the cookies.
a. John wishes that O(Fred would try some of the cookies)
b. O(John wishes that Fred would try some of the cookies)
The configuration in (10a) triggers the embedded implicature that John wishes
that Fred would not try all of the cookies. In (10b), on the other hand, the
implicature is that John doesn’t wish that Fred would try all of the cookies.
Consider first the prediction made by SMH1. Just like in the previous example,
(10)’s assertion supplemented with the embedded implicature in (10a) gives
rise to a meaning stronger than the meaning obtained by incrementing the
same assertion with the implicature in (10b): if John’s desire-worlds are all
worlds where Fred tries some but not all of the cookies, then it is not the
case that all of John’s desire-worlds are worlds where Fred tries all of the
cookies. However, the reverse does not hold: the assertion together with
(10b) is compatible with a state of affairs where in some of John’s desire-
worlds Fred tries all of the cookies, a possibility ruled out by the implicature
in (10a). Therefore, (10a) is predicted to be the preferred reading of the
sentence in (10) by SMH1. One of the conditions that Geurts and Pouscoulous
tested in one of their experiments was embedding of a scalar item under
want and they found that the embedded implicature reading was not the
preferred interpretation of the sentence. If their results can be extended to
any volitional verb, including wish, they show that the prediction made by
SMH1 is not correct. Similarly for SMH2: in this case, both (10a) and (10b) are
predicted to deliver meanings stronger than the meaning obtained without
O and so the two strengthened interpretations are incorrectly predicted
to be equally available. This is so unless some independent contextual
consideration rules out (10a), but as we observed above the context plays no
role in Geurts and Pouscoulous’s experiment and therefore we do not expect
it to be a factor affecting the subjects’ judgments.
In conclusion, even when supplemented with a formal mechanism for
predicting when an embedded implicature will be preferred or dispreferred,
Chierchia et al.’s (2008) localist theory fails to account for the fact that
embedded implicatures are systematically dispreferred. Appealing to contex-
tual/plausibility considerations in order to override the outcome of the theory
is problematic since in Geurts and Pouscoulous’s experiments judgments
were elicited out-of-context.
In the next section, I will look at the exceptional behavior of the verb
believe, for which Geurts and Pouscoulous found a higher acceptance rate for
the embedded implicature than in any other complex condition. Even though
the exceptional behavior of believe initially appears to support a localist
theory, I will conclude that it actually constitutes another challenge for it.
3.1 Believe and other Neg-raising verbs
Consider (11), a variant of Geurts and Pouscoulous’s original sentence.6 (11a)
and (11b) give rise to the embedded implicature reading and the global
implicature reading, respectively.
(11) John believes that Fred tried some of the cookies.
a. John believes that O(Fred tried some of the cookies)
b. O(John believes that Fred tried some of the cookies)
According to Horn (1978) and others, believe is a Neg-raising (NR) verb: a
normal utterance of John doesn’t believe Mary lied implies that John believes
that Mary didn’t lie. Similarly, (11b) will imply that John believes that Fred
did not try all of the cookies. Therefore, both configurations in (11a) and
(11b) give rise to the same implicature and, according to both SMH1 and
SMH2, since both available interpretations are equivalent and are stronger
than the LF without O, they should be equally available. Indeed, Geurts and
Pouscoulous found a relatively high rate of acceptance of the embedded
implicature in the believe/think condition (even though, as we saw, it wasn’t
the preferred interpretation), and they acknowledge the possibility that this
“elevated level of positive responses (57.5%) wasn’t merely an artifact” of the
inference model (Geurts & Pouscoulous 2009).
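The claimed equivalence of (11a) and (11b) under Neg-raising can be illustrated in the same kind of toy model. This is a sketch under stated assumptions: Neg-raising is modeled as an excluded-middle condition on belief states (John either believes that Fred tried all of the cookies or believes that he did not), and that condition is implemented as a simple filter on states, which simplifies its presuppositional status. All names are hypothetical.

```python
from itertools import combinations

# Toy worlds: how many of the cookies Fred tried (an assumption of this sketch).
WORLDS = ("none", "some_not_all", "all")
SOME = {"some_not_all", "all"}
SOME_NOT_ALL = {"some_not_all"}
ALL = {"all"}
NOT_ALL = {"none", "some_not_all"}


def belief_states():
    """Yield every non-empty set of worlds as a candidate belief state."""
    for size in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def believes(state, prop):
    """John believes prop iff prop holds in all of his doxastic worlds."""
    return state <= prop


def excluded_middle(state):
    """NR condition: John believes 'all' or believes 'not all'."""
    return believes(state, ALL) or believes(state, NOT_ALL)


def local_11a(state):
    """(11a): John believes that O(Fred tried some of the cookies)."""
    return believes(state, SOME_NOT_ALL)


def global_11b(state):
    """(11b): O(John believes that Fred tried some of the cookies)."""
    return believes(state, SOME) and not believes(state, ALL)


def models(reading, condition=lambda s: True):
    """The belief states that satisfy both the condition and the reading."""
    return frozenset(s for s in belief_states() if condition(s) and reading(s))


equivalent_with_nr = (
    models(local_11a, excluded_middle) == models(global_11b, excluded_middle)
)
equivalent_without_nr = models(local_11a) == models(global_11b)
```

With the excluded-middle condition in place the two LFs pick out the same belief states, so neither version of the SMH can distinguish them. Without the condition, (11b) is also true when John is undecided between “some but not all” and “all”.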
The problem is that a similar prediction is made by Chierchia et al. (2008)
with respect to want.
(12) John wants Fred to try some of the cookies.
a. John wants O(Fred to try some of the cookies)
b. O(John wants Fred to try some of the cookies)
According to the classification in Horn 1978, want is also NR. It follows that
Chierchia et al.’s (2008) localist theory predicts that both (12a) and (12b)
should be equally available. However, the rate of acceptance of the embedded
6 Geurts and Pouscoulous’s sentence was given in (1). (1) is a translation of the French sentence
actually used in the experiment.
implicature with the modal verb want was low (32%), lower than what they
found in the believe case. The embedded implicature reading is dispreferred,
and nothing in how the exhaustive operator O or SMH work seems to explain
why the embedded implicature is more frequently accepted with believe than
with want.
Geurts and Pouscoulous, following the lines of van Rooij & Schulz 2004
and Russell 2006, sketch a globalist account for why believe shows a higher
acceptance of the embedded implicature: (i) the sentence Bob believes that
Anna ate some of the cookies generates the global implicature that Bob doesn’t
believe that Anna ate all of the cookies; (ii) assuming that Bob has an opinion
about whether Anna ate all of the cookies or not, it follows that Bob believes
that Anna did not eat all of the cookies. Now, in their paper defending a
localist view of implicatures, Gajewski & Sharvit (2009) criticize this type of
globalist account by arguing that appealing to the disjunctive proposition
“either Bob believes that Anna ate all of the cookies or he believes that she
didn’t” in the reasoning above is only plausible because believe is a NR verb
and as such it carries the presupposition that either α believes that ϕ or α
believes that it is not the case that ϕ (as argued in Gajewski 2005). In other
words, according to Gajewski and Sharvit, the globalist account only appears
to work because the predicate is NR and the disjunctive proposition crucial
to the globalist explanation is actually presupposed by the verb. But if this
were correct, then all NR verbs would trigger an embedded implicature since
they all presuppose the relevant disjunctive proposition. But we just saw that
this is not so: the experimental results reported in Geurts and Pouscoulous
show that local implicatures with want are relatively rare, despite want being
a NR verb. A short digression on NR verbs is in order here. I have assumed
with Horn (1978) that want, like believe but unlike wish, is NR based on the
observation that in (13) but not in (14) the first sentence implies the second.
(13) a. I don’t want Mary to leave.
b. I want Mary not to leave.
(14) a. I don’t wish to meet Mary.
b. I wish not to meet Mary.
However, Rooryck (1991) cites the following pair from Horn 1978 against the
view that want/vouloir are NR verbs: while (15) supports the NR hypothesis,
(16) does not.
(15) a. Je ne veux pas que vous sortiez.
“I don’t want you to leave”
b. Je veux que vous ne sortiez pas.
“I want you not to leave”
(16) a. Je ne voudrais pas être Dieu.
“I wouldn’t want to be God”
b. Je voudrais ne pas être Dieu.
“I would want not to be God”
Rooryck concludes that volitional verbs only appear to be NR but in fact they
are not. An exhaustive discussion of this issue is beyond the scope of this
paper. However, what is important in the context of the current discussion
about embedded implicatures is to notice that even if volitional verbs are not
NR, it is still true that in cases such as (15) vouloir behaves like a NR verb
in that the two sentences are judged to be synonymous, just like originally
observed by Horn. Just like in (15), the English rendition of the implicature
in (12b) (i.e. John doesn’t want Fred to try all of the cookies) is also judged to
have a NR interpretation, and so does the French translation with vouloir.7
Therefore, since the logical form in (12b) receives a NR interpretation, it is
expected to pattern like non-volitional NR verbs such as believe with respect
to the computation of the embedded implicatures, and the experimental
results show it does not.
Going back to the main discussion, obviously the globalist needs to
explain the asymmetry between want and believe too. In principle we should
be able to run the reasoning sketched by Geurts and Pouscoulous for believe
with want: (i) (12) generates the implicature that John doesn’t want Fred to
try all of the cookies; (ii) let us assume that John has a definite desire about
Fred’s trying all of the cookies, that is, that either John wants Fred to try all
of the cookies or he wants Fred not to try all of the cookies; (iii) it follows
that John wants Fred not to try all of the cookies. The crucial step is (ii).
What “blocks” (ii) in the want case but not in the believe case?
We saw that dismissing the globalist account by appealing to the pre-
suppositional nature of this disjunctive proposition is not going to work.
According to the Russellian line followed by Geurts and Pouscoulous, an
assumption like “either John wants Fred to eat all of the cookies or John
7 Thanks to Annick Morin for providing the French sentence Je (ne) veux pas que Marie mange
tous les biscuits and for her judgment.
wants Fred not to eat all of the cookies” is purely contextual and as such
it will be part of the common ground in some contexts but not in others.
Whenever the context grants this assumption, the strengthening of the global
implicature happens, giving rise to an embedded implicature effect without
an actual embedded implicature. If this is correct, then the reason why
subjects assented to the local implicature less frequently in the want case
than in the believe case must have to do with how likely they felt they could
make the relevant disjunctive assumption. In particular, it must be the case
that, in the absence of any context, subjects felt that the assumption in (17a)
was less likely to be true than the assumption in (17b).
(17) a. Either John wants Fred to try all of the cookies or John wants Fred
not to try all of the cookies.
b. Either Bob believes that Anna ate all of the cookies or he believes
that she didn’t.
If it is the case that, out of context, people are less likely to make the
assumption in (17a) than the one in (17b), we expect that it should be much
easier to trigger the apparent local implicature with want if the context allows
one to do so. In the globalist theory, then, plausibility considerations such as
the ones outlined above might be expected to distinguish among other NR
verbs which the localist theory would predict pattern alike with respect to
embedded implicatures.8
The localist too can appeal to the context (and Chierchia et al. (2008)
leave this door open explicitly in their paper), but appealing to the context
8 One pair of predicates that might be interesting to test experimentally is the pair expect/ought
to, as in John expects Mary to try some of the cookies and Mary ought to try some of the
cookies. According to Horn 1978, both predicates are NR. The localist theory predicts that
both should give rise to a high rate of acceptance of the embedded implicature (“John
expects Mary not to try all of the cookies” and “Mary ought to not try all of the cookies”,
respectively), at least out of context. The globalist theory, on the other hand, would have to
appeal to two different disjunctive propositions in order to strengthen the global implicature
giving rise to an embedded implicature effect.
(18) a. Either John expects Mary to try all of the cookies or John expects Mary not to try
all of the cookies.
b. Either Mary ought to try all of the cookies or Mary ought to not try all of the
cookies.
At least out-of-context, it seems that (18a) would be easier to assume. If indeed (18a) is
more plausible than (18b), then the globalist theory predicts that the acceptance rate for the
embedded implicature should be higher in the expect case than in the ought case.
in this case is needed to systematically “correct” the predictions of the
theory which are not supported by the experimental findings (recall that the
preferred interpretation according to Chierchia et al.’s (2008) localist theory
augmented with either version of the SMH is not the interpretation preferred
by Geurts and Pouscoulous’s subjects). Finally, we noticed that appealing to
the context in order to override the outcome of the theory does not seem right
in Geurts and Pouscoulous’s experiments since in both the inferential and
the verification tasks the subjects had to make their choice out-of-context.
Since the context plays no role in Geurts and Pouscoulous’s experiments, we
expect the reading selected by the theory to surface undisturbed in people’s
judgments. The fact that the subjects’ judgments did not agree with the
predictions of the localist is therefore problematic.
4 Conclusion
The experimental results presented by Geurts and Pouscoulous are at odds
with the predictions of the localist, in particular the localist theory advocated
by Chierchia et al. (2008). The challenge for the localist is to explain why
embedded implicatures are so infrequent. We focused on the believe and
want conditions in Geurts and Pouscoulous’s experiments. They found that
the acceptance rate for the embedded implicature in the believe condition was
not negligible (even though the embedded implicature reading was still not
the preferred one). Since believe is an NR verb, we observed that the localist view
advocated in Chierchia et al. 2008, together with either version of the strong
meaning hypothesis (what we called SMH1 and SMH2), seems to account
for the elevated rate of positive responses with believe. However, we also
observed that the theory is unable to account for the very low acceptance
rate with want, since want is also an NR verb.
When we considered non-NR verbs such as be certain and wish, which
belong to the same category as believe and want respectively, we saw that
Chierchia et al.’s (2008) localist view augmented with either SMH1 or SMH2
makes incorrect predictions about what should be the preferred interpreta-
tions when a scalar item occurs in an embedded clause.9
9 Whether be certain patterns like believe or not, the localist faces a problem. In the former
case, the localist view faces a problem since be certain is not NR and the embedded impli-
cature reading is predicted by his theory to be the preferred one. In the latter case (i.e. if
be certain does not show the higher rate of acceptance of the embedded implicature found
with believe), the problem is that the localist expects non-NR verbs like be certain to show
high acceptance rate for the embedded implicature.
Furthermore, we noticed that the difference between want and believe
reported in the experiment we are considering also undermines Gajewski and
Sharvit’s criticism of the globalist view. Gajewski and Sharvit have recently
suggested that the globalist account in Russell (2006) seems to work for
believe only because believe is NR (see above for details). But if that were
true, since want is also NR we would expect the two to pattern in the same
way, but they don’t.
It seems hard to reconcile the rates of acceptance of the embedded
implicatures found with epistemic and volitional verbs with the grammatical
theory of embedded implicatures proposed by the localist. Appealing to the
context in order to override the predictions of the theory is problematic since
the context plays no role in Geurts and Pouscoulous’s experiments. On the
other hand, according to the global theory of implicatures, the appearance of
an embedded implicature is due to the strengthening of the global implicature
in a context where specific assumptions are taken to be part of the common
ground: therefore, if the context plays no role (as in Geurts and Pouscoulous’s
experiments) or the relevant assumptions are not made, there will be no
embedded implicature effect. In general, both globalist and localist theories
must appeal to contextual considerations to make the correct predictions but,
unlike in a localist theory, the role played by the context in a globalist theory
is an essential component of the globalist theory itself. Indeed, we noted
that plausibility considerations might be able to account for the contrast
between want and believe reported by Geurts and Pouscoulous. However,
more experimental evidence of the type provided by Geurts & Pouscoulous
(2009) must be collected in order to establish whether these considerations
do play the explanatory role they are expected to play in a global theory of
implicatures.
References
Bach, Kent. 1994. Conversational impliciture. Mind and Language 9(2).
124–162. doi:10.1111/j.1468-0017.1994.tb00220.x.
Carston, Robyn. 1988. Implicature, explicature and truth theoretic semantics.
In Ruth Kempson (ed.), Mental representations: the interface between lan-
guage and reality, 155–181. Cambridge & New York: Cambridge University
Press.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and the
syntax/pragmatics interface. In Structures and beyond, 39–103. Oxford:
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical
pragmatics. In Claudia Maienborn, Klaus von Heusinger & Paul Portner
(eds.), Handbook of semantics, Mouton de Gruyter.
Fox, Danny. 2006. Free choice disjunction and the theory of scalar implica-
tures. http://web.mit.edu/linguistics/people/faculty/fox/free_choice.pdf.
Gazdar, Gerald. 1979. Pragmatics: Implicatures, presuppositions and logical
form. New York: Academic Press.
Gajewski, Jon. 2005. Neg-raising: Polarity and presupposition: MIT disserta-
tion. doi:1721.1/33696.
Gajewski, Jon & Yael Sharvit. 2009. In defense of the grammati-
cal approach to local implicatures. http://web2.uconn.edu/sharvit/
Gajewski-Sharvit-23nov2009.pdf.
Geurts, Bart. 2009. Scalar implicature and local pragmatics. Mind and
Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009. Embedded implicatures?!? Se-
mantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Grice, Paul. 1975. Logic and conversation. In Peter Cole & Jerry Morgan (eds.),
Syntax and semantics 3: Speech acts, 41–58. New York: Academic Press.
Horn, Laurence. 1972. On the semantic properties of logical operators in
English: UCLA dissertation.
Horn, Laurence. 1978. Remarks on neg-raising. In Peter Cole (ed.), Syntax and
semantics 9: Pragmatics, 129–220. New York: Academic Press.
Horn, Laurence. 1989. A natural history of negation. Chicago: The University
of Chicago Press.
King, Jeff & Jason Stanley. 2006. Semantics, pragmatics, and the role of
semantic content. In Zoltan Szabó (ed.), Semantics vs. pragmatics, 111–
164. Oxford: Oxford University Press.
Levinson, Stephen. 1983. Pragmatics. Cambridge & New York: Cambridge
University Press.
Levinson, Stephen. 2000. Presumptive meanings: The theory of generalized
conversational implicatures. Cambridge, MA: MIT Press.
Recanati, François. 2003. Embedded implicatures. Philosophical Perspectives
17(1). 299–332. doi:10.1111/j.1520-8583.2003.00012.x.
van Rooij, Robert & Katrin Schulz. 2004. Exhaustive interpretation of complex
sentences. Journal of Logic, Language and Information 13(4). 491–519.
doi:10.1007/s10849-004-2118-6.
Rooryck, Johan. 1991. Negative and factive islands revisited. Journal of
Linguistics 28(2). 343–374. doi:10.1017/S0022226700015255.
Russell, Benjamin. 2006. Against grammatical computation of scalar implica-
tures. Journal of Semantics 23(4). 361–382. doi:10.1093/jos/ffl008.
Sauerland, Uli. 2004. Scalar implicatures in complex sentences. Linguistics
Michela Ippolito
Department of Linguistics
University of Toronto
130 St. George Street
Toronto, ON M5S 3H1
Canada
michela.ippolito@utoronto.ca
doi: 10.3765/sp.3.7
Embedded implicatures observed:
A comment on Geurts and Pouscoulous (2009)∗
Charles Clifton, Jr. (University of Massachusetts Amherst)
Chad Dube (University of Massachusetts Amherst)
2010-07-08 / Published 2010-07-28
Abstract
Conventionalist theories of scalar implicature differ from other accounts
in that they predict strengthening of embedded scalar terms. Geurts &
Pouscoulous (2009a) argue that experimental support for this prediction is
largely based on sentence comprehension tasks that inflate the frequency
with which terms like some are strengthened. Using a picture verification
task, they observed no strengthening of embedded scalars. We present
data from a multiple-choice picture verification task that is more sensitive
to interpretation preferences, and find that readers do show a preference
for strengthened interpretations even in embedded phrases. These data
cast doubt on Geurts and Pouscoulous’s empirical arguments against the
existence of embedded implicatures.
Keywords: implicatures, scalar terms, interpretation, psycholinguistics
1 Introduction
Geurts & Pouscoulous (2009a)1 present data arguing against what they call
“mainstream conventionalist” and “minimal conventionalist” accounts of the
strengthening of scalar terms like some. Both positions (see Chierchia, Fox &
∗ Acknowledgements: We thank Lyn Frazier for comments on an earlier version of our
manuscript. We thank Maria Bonilla and Morgan Mendes for their assistance in this research.
This project was supported in part by Grant Number HD18708 from NICHD to the University
of Massachusetts. The contents of this paper are solely the responsibility of the authors and
do not necessarily represent the official views of NICHD or NIH.
1 See Chemla 2009, and Geurts & Pouscoulous 2009b, for more discussion.
©2010 Clifton & Dube

Spector 2008 for a survey; see Geurts & Pouscoulous 2009a for additional
references) claim that an “exclusivity” or O-operator is freely prefixed to
any S node with the result that a proposition containing some X, X or Y,
etc. is strengthened to ‘some but not all,’ exclusive ‘or’, etc. Mainstream
conventionalism claims that the strengthened interpretation is the preferred
interpretation, unless it occurs in a context (e.g. a downward-entailing con-
text) which results in a logically weaker global interpretation of the sentence
in which it occurs. Minimal conventionalism merely claims that the strength-
ened interpretation is possible, but says nothing about preference.
One way to evaluate conventionalist approaches is to examine ‘embedded
implicatures’ (or, following Geurts & Pouscoulous, ‘local scalar implicatures’).
Consider a sentence like (1) (Geurts & Pouscoulous’s (7a)):
(1) All students read some of Chierchia’s papers.
Insertion of the exclusivity operator under the scope of all students entails
that all students read some but not all of Chierchia’s papers and thus that no
students read all of Chierchia’s papers. This should be the preferred reading
according to mainstream conventionalism, because it is a stronger (more
limited) claim than the non-strengthened claim. It is also a possible reading
according to minimal conventionalism. However, it is not a pragmatically
justified reading from a Gricean perspective. The author of the statement
presumably did not believe that all students read all of Chierchia’s papers
(else he would have said that). Thus, the pragmatically justified implication
of (1) is (2a). It is not (2b), which is entailed if the exclusivity operator is
inserted.
(2) a. It is not the case that all students read all of Chierchia’s papers.
b. All students read not all of Chierchia’s papers.
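The contrast between (2a) and (2b) can be made concrete with a small model-checking sketch. It is only an illustration under stated assumptions: the three scenarios, the student and paper names, and the function names are hypothetical and are not taken from Geurts & Pouscoulous's materials. Each scenario records which papers each student read, and the three functions test the literal reading of (1), the literal reading strengthened by the global implicature (2a), and the locally strengthened reading (2b).

```python
# Hypothetical toy model: each scenario maps a student to the set of papers read.
PAPERS = frozenset({"p1", "p2", "p3"})


def literal(scenario):
    """(1) read literally: every student read at least one paper."""
    return all(len(read) >= 1 for read in scenario.values())


def global_reading(scenario):
    """(1) plus the implicature (2a): not every student read every paper."""
    everyone_read_everything = all(read == PAPERS for read in scenario.values())
    return literal(scenario) and not everyone_read_everything


def local_reading(scenario):
    """(2b): every student read some but not all of the papers."""
    return all(1 <= len(read) < len(PAPERS) for read in scenario.values())


# Illustrative scenarios (assumptions, not experimental items).
SCENARIOS = {
    "mixed": {"ann": {"p1", "p2", "p3"}, "bob": {"p1"}, "cy": {"p2", "p3"}},
    "nobody_all": {"ann": {"p1"}, "bob": {"p1", "p2"}, "cy": {"p3"}},
    "everybody_all": {"ann": set(PAPERS), "bob": set(PAPERS), "cy": set(PAPERS)},
}

for name, scenario in SCENARIOS.items():
    print(name, literal(scenario), global_reading(scenario), local_reading(scenario))
```

The "mixed" scenario is the informative one: one student read every paper, so (2a) holds while (2b) fails. This is the kind of situation in which the two accounts come apart.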
Geurts & Pouscoulous (2009a) argue that introspective evidence is not ad-
equate to decide what people usually do take sentences with scalar terms
to mean (an argument that is particularly persuasive when the theorist is
doing the introspecting). They present some very interesting ‘verification’
experiments which they claim disconfirm both flavors of conventionalism
(but are consistent with a construal of Gricean pragmatics). In these experi-
ments, a subject is shown a picture and asked whether a sentence containing
a scalar term ‘correctly describes’ the picture. Their subjects nearly univer-
sally accepted sentences as correctly describing pictures that a strengthened
interpretation of the sentence was not true of. For instance, 100% of Geurts
and Pouscoulous’s subjects accepted the sentence in Figure 1 (from Geurts &
Pouscoulous 2009a) as correctly describing the arrangement shown in the
figure, even though the locally-strengthened interpretation (‘all of the squares
are connected to some but not all of the circles’ and thus ‘none of the squares
are connected to all of the circles’) is false of the figure. They concluded,
on the basis of data like these, that “the conventionalist approach to scalar
implicatures has little to recommend it” (Geurts & Pouscoulous 2009a, p 431).
Figure 1 (From Geurts & Pouscoulous 2009a; image not reproduced.) Sentence shown with the display: “All the squares are connected with some of the circles.” Response options: true / false.
Geurts & Pouscoulous (2009a) acknowledged that data they obtained in verbal
“inference” tasks (in which subjects are asked whether a sentence like All
the squares are connected with some of the circles implies All the squares
are connected with some but not all of the circles) exhibited a fair proportion
(on the order of 50%) of strengthened interpretations. However, they state
that such data are suspect. They argue that the proportion of acceptances of
strengthened interpretations is inflated, perhaps because subjects’ attention
is called to the putative implication, so that subjects confuse it with the
legitimate non-embedded Gricean implicature (The square is connected with
some of the circles pragmatically implicates The square is connected with
some but not all of the circles).
We were concerned that the verification task used by Geurts & Pous-
coulous (2009a) has its own bias. Displays like that in Figure 1 can be
correctly described in many ways: There are squares and circles; Squares
and circles are connected to each other; Some squares are connected to some
circles; etc. A pragmatic perspective does not require that only the strongest
interpretation is a correct description, even if it is the preferred description.
Similarly, while a mainstream conventionalist perspective claims that the
preferred (strengthened) interpretation is not strictly true of the display,
the existence of various weaker but legitimate descriptions of the display
suggests that the non-strengthened interpretation may be acceptable. It
may be that the locally-strengthened interpretation is considered to be the
best interpretation of the sentence, as long as it is the globally-strongest
interpretation. However, Geurts and Pouscoulous’s subjects were not asked
whether the display was the best possible depiction of the target sentence.
They were only asked whether the sentence correctly described the display.
A variety of weaker statements and interpretations can still be considered to
be correct descriptions of the display.
From this perspective, it is tempting to consider what would happen if
the subject were given a choice between two displays, one of which honors
the locally-strengthened interpretation and the other of which violates it. If
the locally-strengthened interpretation is the preferred one (as claimed by
the mainstream conventionalist position), subjects should choose the display
that honors it rather than the one that does not. If minimal conventionalism
is on the right path, then subjects should be equally happy choosing either
display. And the same should be true if Gricean pragmatics rules the day:
the proper interpretation should be ‘All the squares are connected to some
and possibly all of the circles.’
We conducted two experiments, modeled on Geurts & Pouscoulous’s
(2009a) Experiments 2 and 3. In each case, we shifted from a verification
format to a choice format. Subjects were shown a sentence and two figures
(generally one honoring a locally-strengthened interpretation, one honoring
only a basic interpretation; see below for details), and asked to choose which
picture was best described by the sentence: the ‘strengthened’ picture, the
‘basic’ picture, “both,” or “neither.” Both experiments were conducted in
a single session, with randomly intermixed presentation of items including
filler items, as described below.
2 Experiment 1
The first experiment was based on Geurts and Pouscoulous’s Experiment
2, in which subjects were given a (Dutch) sentence like ‘Some of the B’s are
in the box on the left’ and a picture containing the letters A, B, and C, and
asked “to decide whether [the sentence] correctly describes [the picture]”
(page 16). The left box had all the B’s and all the A’s, and the right box
had all the C’s. Geurts and Pouscoulous present this experiment not as a
test of whether embedded implicatures are made (the sentences evaluated
are simple sentences, presumably supporting the Gricean implicature that
‘not all of the B’s are in the box on the left’) but simply as a check on
the verification technique. They assumed that a subject who made the
strengthened interpretation of ‘Some of the B’s are in the box on the left’
would reject that sentence as being a correct description of a picture where
all the B’s are in the box on the left. In addition to having their subjects
verify whether such sentences correctly described the pictures, they had their
subjects perform a written inference task. Subjects were asked to decide
whether a sentence like ‘Some of the B’s are in the box on the left’ implies
that not all the B’s are in that box. 62% of their subjects accepted the truth
of such a strengthened inference. However, a substantially smaller 34% of
their subjects denied that the sentence correctly described the picture, as
they should have done had they insisted on the strengthened interpretation.
The only claim that Geurts and Pouscoulous made for these data is that
the inference technique yields inflated rates of scalar implicatures. We
conducted Experiment 1 to shed light on whether this is the right claim, or
whether the picture verification technique used by Geurts and Pouscoulous
underestimated the incidence of scalar implicatures.
2.1 Materials
Four some sentences were constructed, as illustrated in (3). One pair of
pictures was made for each sentence, as illustrated in Figure 2.
(3) Some of the stars are in the box on the left.
An additional 84 items (6 practice items plus 78 items from other experiments,
including Experiment 2, presented below) were constructed. These
were a mixture of picture verification items and written inference accep-
tance items, and tested both the scalar term some and the term or. We
present only the some verification data here, for comparability with Geurts
and Pouscoulous.
Figure 2 Illustration of figures used in Experiment 1 (images not reproduced). Instruction shown with each display: “Please indicate which shape is best described by the sentence below.” Example sentence: “Some of the stars are in the box on the left.”
2.2 Subjects and Procedures
Thirty-six undergraduates at the University of Massachusetts participated;
they received extra credit in their psychology courses in exchange for their
participation. All subjects were tested individually. They viewed all the items
on a computer monitor, and made their responses on a computer keyboard.
The general instructions for all experiments were as follows:
In this experiment, you will be shown several short sentences.
Following each sentence, there will be a question about the
meaning of the sentence. On some trials, you will also be shown
simple diagrams along with the sentences, and you will be
asked to choose the diagram that is best described by the
sentence. Please read the sentences carefully and answer each
question to the best of your ability.
Subjects then advanced through 6 practice trials containing 3 simple verification
and 3 inference items, followed by the individually-randomized
presentation of a total of 82 experimental trials, including the 4 critical trials
for Experiment 1. The verification instructions for all trials in all experiments
simply asked subjects to ‘Please indicate which shape is best described by
the sentence below.’ The sentence to be evaluated was presented below
the verification instruction, and below the sentence was the diagram. The
response options ‘A’, ‘B’, ‘C (Both)’ and ‘D (Neither)’ were indicated below the
diagram (see Figure 2). Subjects made the verification response via key-press.
No time constraint was imposed on the subjects, and participation in the
study took approximately 20 minutes.
2.3 Results and Discussion
Table 1 contains the percentages of choices of each of the four options. The
results are very clear. There was a preponderance of choices of the ‘B’ pair
of boxes, in which some but not all of the named items (e.g., stars) were
on the left; there were more choices of B than A: t(35) = 9.8, p < .001, 95%
CI of difference: (.54, .82). This, of course, is the choice that is consistent
with a strengthened interpretation. Choices of ‘both,’ consistent with a non-
strengthened ‘some and possibly all,’ were fairly infrequent and failed to
rise above the arguable chance level of .25 choices of a given option, t(35)
= .13, p = .90, 95% CI : (.13, .35). Choices of the A picture (which Geurts &
Pouscoulous’s subjects accepted 66% of the time) and the ‘neither’ item were
essentially non-existent.
Choice Option
A        B*       C (“both”)   D (“neither”)
3 (2)    71 (6)   24 (5)       2 (2)
Table 1 Percentages of choices of each option (standard errors in parentheses),
Experiment 1. “Strengthened” alternative indicated by *
The methodological implication is clear: The verification task as used
by Geurts & Pouscoulous (2009a) gives a much smaller estimate of the ex-
tent to which readers arrive at a strengthened interpretation of some in a
non-embedded context than does the choice task we used. Geurts and Pous-
coulous apparently assume that subjects will reject a sentence as a correct
description of a picture if the most preferred interpretation of the sentence
is not true of the picture. However, alternative interpretations of a sentence
are possible; it is possible to cancel a scalar implicature. Under such an inter-
pretation, the quantified sentence seems to be a possible description of the
picture, permitting Geurts and Pouscoulous’s subjects to accept it as such.
However, our choice task permitted our subjects to let us know what their
preferred interpretation of the quantified sentences is. They apparently took
this opportunity to tell us, contrary to Geurts and Pouscoulous’s conclusions,
that they preferred the strengthened interpretation. This methodological con-
clusion justifies re-examining Geurts and Pouscoulous’s verification results
about the (non-) strengthening of embedded implicatures.
3 Experiment 2
The second experiment examined strengthening in embedded implicatures,
using a task like that in Experiment 1. The critical items gave subjects a
quantified sentence containing the scalar term some and asked them to
indicate which of two displays it more accurately described, where one
display pictured the ‘some but not all’ interpretation and the other pictured
the ‘all’ possibility (see Figure 3, version 1; version 2 is a second type of
test, described below). The basic predictions are as follows: If mainstream
conventionalism is correct in a very strict sense, only the display that honors
the strengthened (‘some but not all’) interpretation should be chosen. If
minimal conventionalism is strictly correct, the “both” option should be
chosen (and to the extent that a specific display is chosen, each should be
chosen equally often). On the pragmatic view, a sentence is strengthened by a
conversational implicature that denies “all...all” (e.g., for the sentence All the
squares are connected to some of the circles, it is ‘It is not the case that all the
squares are connected to all of the circles’). Since this interpretation is true
of both the displays in the version 1 portion of Experiment 2, the pragmatic
perspective predicts the same pattern of choices as minimal conventionalism
does.
3.1 Materials
Four sentences were constructed that contained the scalar some. They were
written in two versions each, as illustrated in (4), one with the universal
quantifier all and the other with each.2 Both forms involve embedded impli-
catures, and do not support scalar implicatures from a Gricean perspective.
Each of the four items referred to a different triple of shapes.
2 This manipulation was included based on the intuition – which proved to be incorrect –
that the more individuating nature of each compared to all would discourage a ‘group’
interpretation of the predicate and encourage strengthening.
(4) a. All of the squares are connected to some of the circles.
b. Each of the squares is connected to some of the circles.
Two different figures, each with two designs, were made up for each of the
four items. An illustration appears in Figure 3. One figure (top panel in
Figure 3, Version 1) contained one design that honored the strengthened
interpretation (the B item) and one design that honored the unstrengthened
‘all’ interpretation. The predictions for these items were laid out earlier. The
other figure (bottom panel, Version 2) was designed so that neither design
was true of the strengthened interpretation. For these items, a reader who
arrived at that interpretation (i.e., a reader who made a local or embedded
implicature) should choose Option D, ‘neither.’ A reader who did not take the
strengthened interpretation should find either display acceptable and ideally
choose Option C, ‘both.’
3.2 Subjects and Procedures
Since they were conducted together, details regarding the subjects and pro-
cedures for Experiment 2 are identical to those of Experiment 1, with the
exception that each subject received 8 critical trials. Each subject saw all four
sentences twice, once where one figure honored the strengthened interpreta-
tion (Figure 3, Version 1) and once where neither figure did (Figure 3, Version
2). Two of each of these had the quantifier all and two, each, counterbalanced
over subjects so that each item was tested with each quantifier equally often.
Apart from this variation, trials differed only in the particular forms used
(circles, triangles, stars, moons, hearts, etc.).
3.3 Results and Discussion
Table 2 contains the percentages of choices of each option. Trials on which
subjects were presented with a design that honored the strengthened
interpretation (‘Version 1’) provided evidence that they frequently arrived at
the strengthened interpretation: There were substantial numbers of choices
of the design that honored that interpretation, but essentially none of just
the design that was inconsistent with it. t tests comparing the probability
of a strengthened response to .25 indicated significant strengthening for
Version 1, t(71) = 2.59, p < .05, 95% CI : (.28, .48). However, the most frequent
choice was option ‘C,’ “both,” which is the answer that is consistent with
the non-strengthened, ‘logical,’ interpretation. Indeed, this option was chosen
significantly more often than option B, t(71) = 2.13, p < .05, 95% CI of
difference: (.01, .42).
Figure 3 Illustration of figures used in Experiment 2 (images not reproduced). Instruction shown with each display: “Please indicate which shape is best described by the sentence below.” Sentence: “All/Each of the squares are connected to some of the circles.” Version 1: figure used where B option illustrated the strengthened interpretation. Version 2: figure used where neither option illustrated the strengthened interpretation.
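As a rough consistency check on the Version 1 comparison of strengthened responses against .25, the reported confidence interval can be converted back into a t statistic. The sketch below rests on two assumptions that the text does not state: that the interval was computed as mean ± critical value × standard error, and that the two-sided 95% critical value for 71 degrees of freedom is approximately 1.994 (a tabled value, rounded). The function name is hypothetical.

```python
# Approximate two-sided 95% critical value of Student's t for df = 71
# (assumption: taken from a standard t table and rounded).
T_CRIT_DF71 = 1.994


def t_from_ci(lower, upper, null_value, t_crit):
    """Recover mean, standard error and t from a symmetric confidence interval.

    Assumes the interval was built as mean +/- t_crit * standard_error.
    """
    mean = (lower + upper) / 2
    standard_error = (upper - lower) / (2 * t_crit)
    t_value = (mean - null_value) / standard_error
    return mean, standard_error, t_value


# Reported for Version 1: 95% CI (.28, .48), tested against chance level .25.
mean, standard_error, t_value = t_from_ci(0.28, 0.48, 0.25, T_CRIT_DF71)
print(round(mean, 2), round(standard_error, 3), round(t_value, 2))
```

Under these assumptions the recovered value lies close to the reported t(71) = 2.59, so the interval and the test statistic appear mutually consistent up to rounding.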
Trials on which neither design honored the strengthened interpretation
received a substantially increased number of option ‘D’ (“neither”) interpretations,
which are consistent with a strengthened interpretation of the scalar,
t(71) = 4.39, p < .001, 95% CI of the difference: (.10, .26). Version 2 also
produced substantially more choices of option ‘A’ than option ‘B,’ t(71) = 4.15,
p < .001, 95% CI of the difference (.11, .31), which is further reflected in a
significant increase in the probability of choosing the A figure from Version 1
to Version 2, t(71) = 5.13, p < .001, 95% CI of the difference: (.15, .34). However,
the most-frequent choice was option ‘C,’ “both,” the interpretation that is
consistent with the non-strengthened interpretation (vs. option A: t(71) =
2.74, p < .01, 95% CI : (.07, .43)).
Choice Option
Quantifier   A        B*        C (“both”)   D (“neither”)
Version 1: B alternative strengthened
all          3 (2)    39* (7)   57 (8)       1 (1)
each         0 (0)    38* (7)   63 (7)       0 (0)
Version 2: Neither alternative strengthened
all          28 (7)   6 (3)     50 (8)       17* (6)
each         24 (6)   4 (2)     51 (8)       21* (6)
Table 2 Percentages of choices of each option (standard errors in parentheses),
Experiment 2. “Strengthened” alternative indicated by *
The greater frequency of choices of ‘A’ than of ‘B’ is of some interest.
It has two apparent possible interpretations. From a Gricean perspective, a
writer who wanted to describe the B picture would have written Each of the
squares is connected to all of the circles. Since this is not what the sentence
said, the sentence should not be taken to refer to the B picture. From a
local strengthening perspective, the (strengthened) interpretation ‘Each of
the squares is connected to some but not all of the circles’ is falsified by each
of the squares in the B picture, but only by one square in the A picture. This
could have encouraged choice of A as the ‘less-wrong’ alternative.
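The two candidate explanations can be illustrated with a small sketch. The displays below are hypothetical reconstructions based only on the verbal description above (one square connected to all circles in A, every square connected to all circles in B). The number of shapes, the labels, and the function names are assumptions and are not the actual experimental figures.

```python
# Hypothetical Version 2 displays: each maps a square to the circles it touches.
CIRCLES = frozenset({"c1", "c2", "c3"})

DISPLAY_A = {"s1": set(CIRCLES), "s2": {"c1"}, "s3": {"c2"}}
DISPLAY_B = {"s1": set(CIRCLES), "s2": set(CIRCLES), "s3": set(CIRCLES)}


def local_falsifiers(display):
    """Squares that falsify 'connected to some but not all of the circles'."""
    return sorted(sq for sq, linked in display.items()
                  if not 0 < len(linked) < len(CIRCLES))


def gricean_reading(display):
    """Every square is connected to a circle, and it is not the case that
    every square is connected to every circle."""
    every_square_linked = all(len(linked) > 0 for linked in display.values())
    all_to_all = all(linked == CIRCLES for linked in display.values())
    return every_square_linked and not all_to_all


for label, display in (("A", DISPLAY_A), ("B", DISPLAY_B)):
    print(label, len(local_falsifiers(display)), gricean_reading(display))
```

In this toy setting the Gricean reading is true of A but not of B, and the locally strengthened reading is falsified by one square in A but by every square in B. Either way A comes out as the better candidate, which is why the choice data alone do not separate the two explanations.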
4 Conclusions
Methodologically, the conclusion is clear: While Geurts & Pouscoulous (2009a)
may be correct in their concern that an inference judgment test yields an
inflated number of instances of apparent strengthening of scalar terms,
their alternative – the picture verification task, as they used it – apparently
underestimates strengthening. When subjects were given a choice between
two figures, only one of which honored the strengthened interpretation, they
showed a distinct preference for choosing that figure. Geurts & Pouscoulous
(2009a) took their verification data to show that subjects never, or almost
never, rejected figures that violated strengthening of an embedded scalar
term. Our data show that our subjects nonetheless showed a substantial
preference for a figure that honored strengthening when given a choice
between the two types of figures (and further, that they showed a smaller but
still substantial frequency of rejecting both figures when neither honored
strengthening). We submit that Geurts and Pouscoulous’s conclusion that
readers do not make embedded implicatures is based on suspect data, and
hence is at best premature.
Theoretically, though, the cup may be only half full. While our data
show that readers who make the choice between the strengthened and the
unstrengthened interpretation of an embedded scalar strongly prefer the
former, they also show that the most common response is not to choose
between the interpretations but to accept both. Such ecumenism is not a
given; Experiment 1, which tested non-embedded scalar terms, found that
“both” choices were fairly infrequent. The choice of “both” in Experiment
2 presumably reflects the absence of strengthening. Perhaps the right con-
clusion is that an apparently strengthened interpretation of an embedded
scalar term like some is possible, but not obligatory and not even preferred.
This conclusion may present some difficulty to one who holds a pragmatic
Gricean perspective. As Geurts & Pouscoulous (2009a) make clear, Gricean
accounts of strengthening of scalar terms under the scope of (e.g.) think
and believe (Geurts 2009) do not readily generalize to scalar terms under
the scope of all or each. In the absence of a Gricean account of pragmatic
strengthening under the scope of such terms, our results call Gricean ac-
counts generally into question. Similarly, our findings may present some
difficulty for a mainstream conventionalist perspective: It is not clear from
such a perspective why the strengthened interpretation is apparently taken
less frequently than the basic interpretation. The minimal conventionalist
perspective discussed by Geurts & Pouscoulous (2009a) can accommodate
our data, as can a perspective that says that terms like some are simply
ambiguous, but these perspectives are so unconstraining that one would
hope to adopt them only as a last resort. We can conclude only that the
evidence presented by Geurts & Pouscoulous (2009a) has not made a solid
case against the existence of local, embedded implicatures. We trust that
additional experimental research will clarify the conditions under which such
implicatures are made, and hope that additional linguistic analysis will shed
light on why these conditions encourage strengthening.
References
Chemla, Emmanuel. 2009. Universal Implicatures and free choice effects: Ex-
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical
pragmatics. In Claudia Maienborn, Klaus von Heusinger & Paul Portner
(eds.), Semantics: An international handbook of natural language mean-
ing, Berlin: Mouton de Gruyter. http://semanticsarchive.net/Archive/
WMzY2ZmY/CFS_EmbeddedSIs.pdf. To appear.
Geurts, Bart. 2009. Scalar implicatures and local pragmatics. Mind and
Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: a re-
doi:10.3765/sp.2.5.
Charles Clifton, Jr.
Tobin Hall
135 Hicks Way
University of Massachusetts
Amherst, MA 01003 USA
cec@psych.umass.edu

Chad Dube
Tobin Hall
135 Hicks Way
University of Massachusetts
Amherst, MA 01003 USA
cdube@psych.umass.edu
doi: 10.3765/sp.3.11
Conjunctive interpretation of disjunction∗
Robert van Rooij
ILLC, University of Amsterdam
Received 2010-02-02 / First Decision 2010-03-21 / Revision Received 2010-04-19 /
Second Decision 2010-04-20 / Revision Received 2010-05-12 / Third Decision 2010-06-10 /
Revision Received 2010-07-13 / Accepted 2010-08-18 / Published 2010-09-15
Abstract In this extended commentary I discuss the problem of how to
account for “conjunctive” readings of some sentences with embedded dis-
junctions for globalist analyses of conversational implicatures. Following
Franke (2010, 2009), I suggest that earlier proposals failed, because they did
not take into account the interactive reasoning of what else the speaker could
have said, and how else the hearer could have interpreted the (alternative)
sentence(s). I show how Franke’s idea relates to more traditional pragmatic
interpretation strategies.
Keywords: embedded implicatures, optimal interpretation, free choice permission
1 Introduction
Neo-Gricean explanations of what is meant but not explicitly said are very
appealing. They start with what is explicitly expressed by an utterance, and
then seek to account for what is meant in a global way by comparing what
the speaker actually said with what he could have said. Recently, some
researchers (e.g., Levinson (2000), Chierchia (2006), Fox (2007)) have argued
that it is wrong to start with what is explicitly expressed by an utterance.
Instead — or so it is argued — implicatures should be calculated locally at
linguistic clauses. For what it is worth, I find the traditional globalist analysis
of implicatures more appealing, and all other things equal, I prefer the global
∗ The content of this paper was crucially inspired by Michael Franke’s dissertation, and earlier
work done on free choice permission by Katrin Schulz. Besides them, I would also like to
thank the reviewer of this paper and the editors of this journal (David Beaver in my case) for
their useful and precise comments on an earlier version of this paper.
©2010 Robert van Rooij

analysis to a localist one. But, of course, not all things are equal. Localists
provided two types of arguments in favor of their view: experimental evidence
and linguistic data. I believe that the ultimate “decision” on which line to
take should, in the end, depend only on experimental evidence. I do not have
much to say about this, but I admit to being happy with the experimental results
reported by Chemla (2009) and Geurts & Pouscoulous (2009a), which mostly
seem to favor a neo-Gricean explanation.
But localists provided linguistic examples as well, examples that according
to them could not be explained by standard “globalist” analyses. Impos-
sibility proofs in pragmatics, however, are hard to give. Many examples
involve triggers of scalar implicatures like or or some embedded under other
operators. Some early examples include φ ∨ (ψ ∨ χ) and □(φ ∨ ψ). Localist
theories of implicatures were originally developed to account for examples
of this form. As for the first type of example, globalists soon pointed out
that these are actually unproblematic to account for. As for the second type,
Geurts & Pouscoulous (2009a) provide experimental evidence that implicature
triggers like or and some used under the scope of an operator like believe or
want do not necessarily give rise to local implicatures. That is, many more
participants of their experiments infer the implicature (1-b) from (1-a), than
infer (2-b) and (3-b) from (2-a) and (3-a), respectively. Moreover, they show
that there is little evidence that people in fact infer (3-b) from (3-a).
(1) a. Anna ate some of the cookies.
b. Anna didn’t eat all of the cookies.
(2) a. Bob believes that Anna ate some of the cookies.
b. Bob believes that Anna did not eat all of the cookies.
(3) a. Bob wants Anna to hear some of the Verdi operas.
b. Bob wants Anna not to hear all of the Verdi operas.
These data are surprising for localist theories of implicatures according
to which scalar inferences occur systematically and freely in embedded
positions. The same data are accounted for rather easily, however, on a
global analysis.1 Thus, Geurts & Pouscoulous (2009a) argue that localist
theories of embedded implicatures tend to over-generate, and that global
neo-Gricean theories predict much better.
1 See Geurts & Pouscoulous 2009a and Geurts & Pouscoulous 2009b for discussion, and
footnote 18.
It is well-known, however, that globalist theories have serious problems
with other examples involving triggers used in embedded contexts as well.
Problematic examples include conditionals with disjunctive antecedents like
(φ ∨ ψ) > χ and free choice permissions like ♦(φ ∨ ψ). Both examples seem
to give rise to “conjunctive” interpretations: from ♦(φ ∨ ψ), for example,
we infer ♦φ ∧ ♦ψ. Standard neo-Gricean analyses like those of Sauerland
(2004) and van Rooij & Schulz (2004), however, do not predict this. Fox (2007)
has shown that this conjunctive interpretation follows once we make use of
recursive exhaustification, and Chemla (2009) has defined a new operator
that can be applied globally to the formula ♦(φ ∨ ψ) and still gives rise
to the desired conjunctive reading. This is certainly appealing, but it is
not so clear that Chemla’s analysis is truly neo-Gricean. In the words of
Geurts & Pouscoulous (2009b), “Defining an operator is one thing; providing
a principled pragmatic explanation is quite another”. Franke (2010, 2009)
provided such a principled pragmatic explanation of these data making use
of game theory.2 The purpose of this paper is to show how this analysis
relates to more traditional pragmatic interpretation strategies. As we will see,
this reformulation also involves multiple uses of exhaustive interpretation. I
will explain how the analysis still differs from the analysis of Fox (2007), and
suggest that it is more Gricean in spirit.
The experimental data of Chemla (2009) are mostly problematic for lo-
calist analyses of implicatures. He found, for instance, that sentences of
the form ∀x(Px ∨ Qx) do not routinely give rise to the expected “local”
implicature that ∀x¬(Px ∧ Qx).3 Still, there is at least one type of
experimental result that, he claims, favors a localist analysis. Chemla (2009)
found that just as for sentences of the form ♦(φ ∨ ψ), sentences of the form
∀x♦(Px ∨ Qx) also give rise to a “conjunctive” interpretation: they license
the inference to ∀x♦Px ∧ ∀x♦Qx. Chemla claims that this inference is
predicted by a localist analysis, but not by a globalist one. In section 4.3 we
will come back to this issue.
2 For a rather different pragmatic explanation of these data, see Chemla 2008.
3 Geurts & Pouscoulous (2009a) found something similar, and claim that on the basis of their
data one should conclude that this inference simply never takes place. I am not sure, though,
whether they also tested that the inference also does not take place in case a sentence like
Everybody likes bananas or apples is given as answer to the explicit question What does
everybody like?.
2 In need of pragmatic explanation
2.1 Conditionals with disjunctive antecedents
It seems reasonable that any adequate theory of conditionals must account
for the fact that at least most of the time instantiations of the following
formula (Simplification of Disjunctive Antecedents, SDA) are true:
(4) (SDA) [(φ ∨ ψ) > χ] → [(φ > χ) ∧ (ψ > χ)]
For instance, intuitively we infer from (5-a) that both (5-b) and (5-c) are true:
(5) a. If Spain had fought on either the Allied side or the Nazi side, it
would have made Spain bankrupt.
b. If Spain had fought on the Allied side, it would have made Spain
bankrupt.
c. If Spain had fought on the Nazi side, it would have made Spain
bankrupt.
Of course, if the conditional is analyzed as material or strict implication,
this comes out immediately. Many researchers, however, don’t think these
analyses are appropriate, and many prefer an analysis along the lines of
Lewis and Stalnaker. Adopting the limit assumption,4 one can formulate their
analyses in terms of a selection function, f, that selects for each world w and
sentence/proposition φ the closest φ-worlds to w. A conditional represented
as φ > χ is now true in w iff f_w(φ) ⊆ χ. This analysis, however, does
not make (SDA) valid. The problem is that if we were to make this principle
valid, e.g., by saying that f_w(φ ∨ ψ) = f_w(φ) ∪ f_w(ψ), then the theory would
lose one of its most central features, its non-monotonicity. The principle of
monotonicity,
(6) (MON) [φ > χ] → [(φ ∧ ψ) > χ],
becomes valid. That is, by accepting SDA, we can derive MON on the as-
sumption that the connectives are interpreted in a Boolean way,5 and we
end up with a strict conditional account. We have seen already that the
strict conditional account (or the material conditional account) predicts SDA,
4 The assumption that for any world there is at least one closest φ-world for any consistent
φ — see Lewis 1973 for classic discussion.
5 From φ > χ and the assumption that connectives are interpreted in a Boolean way, we can
derive ((φ ∧ ψ) ∨ (φ ∧ ¬ψ)) > χ. By SDA we can then derive (φ ∧ ψ) > χ.
but perhaps for the wrong reasons. The Lewis/Stalnaker account does not
validate MON because SDA is not a theorem of their logic. Although there are
well-known counterexamples to SDA,6 we would still like to explain why it
holds in “normal” contexts. A simple “explanation” would be to say that a
conditional of the form (φ ∨ ψ) > χ can only be used appropriately in case
the best φ-worlds and the best ψ-worlds are equally similar to the actual
world. Though this suggestion gives the correct predictions, it is rather ad
hoc. We would like to have a “deeper” explanation of this desired result in
terms of a general theory of pragmatic interpretation.
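As a rough illustration of why the Lewis/Stalnaker analysis does not validate (SDA), the following Python sketch models the selection function with a toy similarity ranking. The world names, the `rank` table, and the helper names `closest` and `conditional` are assumptions made for this example only; they are not part of the analyses discussed above.

```python
# Toy model: worlds are named by strings; `rank` is a hypothetical similarity
# ordering relative to the evaluation world (lower rank = closer).
rank = {"a": 1, "b": 2, "c": 3}


def closest(prop):
    """Selection function f_w: the closest worlds in `prop` (limit assumption)."""
    if not prop:
        return set()
    best = min(rank[x] for x in prop)
    return {x for x in prop if rank[x] == best}


def conditional(antecedent, consequent):
    """phi > chi is true iff f_w(phi) is a subset of chi."""
    return closest(antecedent) <= consequent


phi = {"a"}        # the closest phi-world is a
psi = {"b"}        # the closest psi-world is b, which is farther away than a
chi = {"a", "c"}   # chi holds in a and c, but not in b

disjunctive = conditional(phi | psi, chi)   # (phi or psi) > chi
first = conditional(phi, chi)               # phi > chi
second = conditional(psi, chi)              # psi > chi
```

In this toy model the closest (φ ∨ ψ)-world is a φ-world, so the disjunctive conditional and φ > χ come out true while ψ > χ is false. If `a` and `b` were given the same rank, the disjunctive conditional could only be true if both simpler conditionals were, which is exactly the situation that the ad hoc suggestion above stipulates.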
2.2 Free choice
The free choice problem is a problem about permission sentences. Intuitively,
from the (stated) permission You may take an apple or a pear one can
conclude that you can take an apple and that you can take a pear (though
perhaps not both). This intuition is hard to account for, however, on any
standard analysis of permission sentences. There is still no general agreement
on how to interpret such sentences. In standard deontic logic (e.g.,
Kanger 1981, though basically due to Leibniz (1930)) it is assumed that
permission sentences denote propositions that are true or false in a world, and
that deontic operators (like ought and permit) apply to propositions. The
permission ♦φ is considered to be true in w just in case there exists a world
deontically accessible from w in which φ is true. Obviously, such an analysis
predicts that ♦φ ⊨ ♦(φ ∨ ψ).7 This analysis does not predict, however, that
♦(φ ∨ ψ) ⊨ ♦φ ∧ ♦ψ. According to other traditions (e.g., von Wright 1950,
Lewis 1979), we should look at permission sentences from a more dynamic
perspective. But there are still (at least) two ways of doing this. According
to the performative analysis (cf., Lewis 1979), the main point of making a
permission is to change a prior permissibility set to a posterior one. This
analysis might still be consistent with the deontic logic approach in that it
assumes that what is permitted denotes a proposition. Another tradition
(going back to von Wright 1950) is based on the assumption that deontic
concepts are usually applied to actions rather than propositions. Although
permissions are now said to apply to actions, a permission sentence by itself
6 See Fine 1975.
7 In the philosophical literature, this is sometimes called the paradox of free choice permission,
because it is taken to be problematic.
is still taken to denote a proposition, and is true or false in a world.8
2.2.1 A conditional analysis with dynamic logic
Let us first look at the latter approach according to which deontic operators
are construed as action modalities. Dynamic logic (Harel 1984) makes a
distinction between actions (and action expressions) and propositions.
Propositions hold at states of affairs, whereas actions produce a change of state.
Actions may be nondeterministic, having different ways in which they can
be executed. The primary logical construct of standard dynamic logic is the
modality ⟨α⟩φ, expressing that φ holds after α is performed. This modality
operates on an action α and a proposition φ, and is true in world w if
some execution of the action α in w results in a state/world satisfying the
proposition φ.
Dynamic logic starts with two disjoint sets; one denoting atomic propositions,
the other denoting atomic actions. The set of action expressions is
then defined to be the smallest set A containing the atomic actions such that
if α, β ∈ A, then α ∨ β ∈ A and α; β ∈ A.9 The set of propositions is defined
as usual, with the addition that it is assumed that if α is an action expression
and φ a proposition, then ⟨α⟩φ is a proposition as well. To account
for permission sentences we will assume that Per(α) is a proposition
as well.
Propositions are just true or false in a world. To interpret the action
expressions, it is easiest to let them denote sets of pairs of worlds. The mapping
τ gives the interpretation of atomic actions. The mapping τ is extended to
give interpretations to all action expressions by τ(α; β) = τ(α); τ(β) and
τ(α ∨ β) = τ(α) ∪ τ(β). The action α; β consists of executing first α, and
then β. The action α ∨ β can be performed by executing either α or β. We
write τ_w(α) for the set {v ∈ W | ⟨w, v⟩ ∈ τ(α)}. Thus, τ_w(α) is the set of all
worlds you might end up in after performing α in w. We will say that Per(α)
is true in w, w ⊨ Per(α),10 just in case τ_w(α) ⊆ P_w, where P_w is the set of
8 There is yet another way to go, which recently became popular as well (e.g., Portner 2007):
assume that permission applies to an action, but assume that a permission statement also
changes what is permitted. I won’t go into this story here. Another story I won’t go into here
is the resource-sensitive logic approach to free choice permission proposed in Barker 2010,
a paper I became aware of just as the current paper was going to press.
9 I will ignore iteration here.
10 Strictly speaking the definition of ⊨ should be relativized to a model, but the model remains
implicit here as throughout the paper.
permissible worlds in w. Notice that this way of interpreting permissions
gives them a conditional flavor: Per(α) really means that it is acceptable to
perform α.11 Given the interpretation of disjunctive actions, it immediately
follows that we can account for free choice permission: from the truth of
Per(α ∨ β) we can infer the truth of Per(α) and Per(β).12
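The semantics just described can be made concrete with a small sketch, given here only as an illustration. Action expressions are interpreted as sets of world pairs, and Per(α) is checked by set inclusion. The atomic actions, the world names and the function names (`interpret`, `outcomes`, `per`) are hypothetical choices made for this example.

```python
# tau maps each atomic action to a set of (source, target) world pairs.
tau = {
    "take_apple": {("w0", "w1")},
    "take_pear": {("w0", "w2")},
}


def interpret(action):
    """Interpret an action expression: an atom name, ("or", a, b) or ("seq", a, b)."""
    if isinstance(action, str):
        return tau[action]
    op, left, right = action
    l, r = interpret(left), interpret(right)
    if op == "or":       # tau(a or b) = tau(a) union tau(b)
        return l | r
    if op == "seq":      # tau(a; b) = relational composition of tau(a) and tau(b)
        return {(x, z) for (x, y1) in l for (y2, z) in r if y1 == y2}
    raise ValueError(f"unknown operator: {op}")


def outcomes(action, w):
    """tau_w(action): the worlds one may end up in after performing action in w."""
    return {v for (x, v) in interpret(action) if x == w}


def per(action, w, allowed):
    """Per(action) is true in w iff tau_w(action) is a subset of P_w (here: allowed)."""
    return outcomes(action, w) <= allowed


choice = ("or", "take_apple", "take_pear")
```

With `allowed = {"w1", "w2"}`, `per(choice, "w0", allowed)` holds, and so do the permissions for each disjunct, as the free choice inference requires. With `allowed = {"w1"}`, taking the apple is permitted but the disjunctive action is not, which matches the remark in footnote 12.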
Although free choice permission follows, one wonders whether it should
be built into the semantics: if I allow you to do α this doesn’t mean that I
allow you to do α in any way you want. I only allow you to do α in the best
way. To account for this latter rider, we can add to our models a selection
function, f, that picks out the best elements of any set of possible worlds X
for every world w. Then we say that ♦α is true in w iff f_w(τ_w(α)) ⊆ P_w. But
even if f_w(X ∪ Y) ⊆ f_w(X) ∪ f_w(Y), it is still not guaranteed that f_w(X ∪ Y) =
f_w(X) ∪ f_w(Y), and thus the free choice permission inference is not guaranteed
either. Of course, the inference follows in case
f_w(τ_w(α ∨ β)) = f_w(τ_w(α)) ∪ f_w(τ_w(β)),
but we would like to have a pragmatic explanation of why this should be the
case if an assertion of the form ♦(α ∨ β) is given.
2.2.2 A performative analysis
Lewis (1979) and Kamp (1973, 1979) have proposed a performative analysis
of command and permission sentences involving a master and his slave. On
their analysis, such sentences are not primarily used to make true assertions
about the world, but rather to change what the slave is obliged/permitted to
do.13 But how will permission sentences govern the change from the prior
permissibility set, Π, to the posterior one, Π′? Kamp (1979) proposes that
this change depends on a reprehensibility ordering, ≤, on possible worlds.
The effect of allowing φ is that the best φ-worlds are added to the old
permissibility set to figure in the new permissibility set. This set will be
denoted as Π*_φ and is defined in terms of the relation ≤ as follows:
(7) Π*_φ =def {u ∈ φ | ∀v ∈ φ : u ≤ v}
Thus, the change induced by the permission You may do φ is that the new
permission set, Π′, is just Π ∪ Π*_φ. Note that according to this performative
account it does not follow that for a permission sentence of the form You
11 See Asher & Bonevac 2005 for a conditional analysis of permissions sentences.
12 Notice also that another paradox of standard deontic logic is avoided now: from the
permission of α, Per(α), the permission of α ∨ β, Per(α ∨ β) doesn’t follow.
13 For further discussion of this model, see e.g, van Rooij 2000.
may do φ or ψ the slave can infer that according to the new permissibility set
he is allowed to do any of the disjuncts. Still, in terms of Kamp’s analysis we
can give a pragmatic explanation of why disjuncts are normally interpreted
in this “free choice” way. To explain this, let me first define a deontic
preference relation between propositions, ≺, in terms of our reprehensibility
relation between worlds, <. We can say that although both φ and ψ are
incompatible with the set of ideal worlds, φ is still preferred to ψ, φ ≺ ψ,
iff the best φ-worlds are better than the best ψ-worlds, i.e., ∃v ∈ φ
∀u ∈ ψ : v < u. Then we can say that with respect to ≺, φ and ψ are
equally reprehensible, φ ≈ ψ, iff φ ⊀ ψ and ψ ⊀ φ. It is easily seen that
Π*_{φ∨ψ} = Π*_φ ∪ Π*_ψ iff φ ≈ ψ. How can we now explain the free choice effect?
According to a straightforward suggestion, a disjunctive permission can
only be made appropriately in case the disjuncts are equally reprehensible.14
This suggestion, of course, exactly parallels the earlier suggestions of when
conditionals with disjunctive antecedents can be used appropriately, or
disjunctive permissions according to the dynamic logic approach. Like these
earlier suggestions, however, this new suggestion by itself is rather ad hoc,
and one would like to provide a “deeper” explanation in terms of more
general principles of pragmatic reasoning.
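The claim that Π*_{φ∨ψ} = Π*_φ ∪ Π*_ψ exactly when φ ≈ ψ can be checked on a toy model. The sketch below is only an illustration: it represents the reprehensibility ordering by hypothetical numeric ranks (a lower number means a less reprehensible world), and the function names `best`, `preferred` and `equally_reprehensible` are invented for the example.

```python
def best(prop, rank):
    """Pi*_phi as in (7): the worlds in prop that are at least as good as every world in prop."""
    return {u for u in prop if all(rank[u] <= rank[v] for v in prop)}


def preferred(phi, psi, rank):
    """phi is preferred to psi: some phi-world is strictly better than every psi-world."""
    return any(all(rank[v] < rank[u] for u in psi) for v in phi)


def equally_reprehensible(phi, psi, rank):
    """phi and psi are equally reprehensible iff neither is preferred to the other."""
    return not preferred(phi, psi, rank) and not preferred(psi, phi, rank)


phi, psi = {"a", "b"}, {"c", "d"}

# Case 1: the best phi-world (a) and the best psi-world (c) are equally good.
equal = {"a": 1, "b": 2, "c": 1, "d": 3}
# Case 2: the best phi-world is strictly better than every psi-world.
unequal = {"a": 1, "b": 2, "c": 2, "d": 3}

union_holds_equal = best(phi | psi, equal) == best(phi, equal) | best(psi, equal)
union_holds_unequal = best(phi | psi, unequal) == best(phi, unequal) | best(psi, unequal)
```

In the first ranking both disjuncts contribute their best worlds to the new permissibility set, so the slave may do either. In the second, only the best φ-world is added, and the free choice inference fails.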
3 Pragmatic interpretation
3.1 The standard received view
Implicatures come in many varieties, but scalar implicatures have received
the most attention from linguists. A standard way to account for the scalar
implicatures of ‘φ’ is to assume that φ is associated with a set of alternatives,
A(φ), and that the assertion of φ implicates that all its stronger alternatives
are false.
(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.
If the alternative of Some of the students passed is All of the students passed,
the desired scalar implicature is indeed accounted for. McCawley (1993)
noticed, however, that if one scalar item is embedded under another one — as
14 For an alternative proposal using this framework, see van Rooij 2006.
in (9)15 — an interpretation rule like Prag does not give rise to the desired
prediction that only one student passed if the alternatives are defined in the
traditional way.
(9) Alice passed or (Bob passed or Cindy passed).
This observation can be straightforwardly accounted for if we adopt a different
pragmatic interpretation rule and a different way to determine alternatives.
First, we assume that the set of alternatives includes { Alice passed, Bob
passed, Cindy passed } (which should perhaps be closed under conjunction
and disjunction). According to the new pragmatic interpretation rule Exh, w
is compatible with the pragmatic interpretation of φ iff (i) φ is true in w,
and (ii) there is no other world v in which φ is true where fewer alternatives
in A(φ) are true than are true in w, see (10). In the following, we abbreviate
the condition ∀ψ ∈ A(φ) : v ∈ ψ ⇒ w ∈ ψ by v ≤_{A(φ)} w, and define
v <_{A(φ)} w in terms of this in the usual way.
(10) Exh(φ) = {w ∈ φ | ¬∃v ∈ φ : v <_{A(φ)} w}.
The pragmatic interpretation rule Exh correctly predicts that from (9) we can
pragmatically infer that only one of Alice, Bob, and Cindy passed. In fact, this
pragmatic interpretation rule is better known as the exhaustive interpretation
of a sentence (e.g., Groenendijk & Stokhof 1984, van Rooij & Schulz 2004,
Schulz & van Rooij 2006, Spector 2003, 2006). By interpreting sentences
exhaustively one can account for many conversational implicatures. But
from a purely Gricean point of view, the rule is too strong. All that the
Gricean maxims seem to allow us to conclude from a sentence like Some of
the students passed is that the speaker does not know that All of the students
passed is true; not the stronger proposition that the latter sentence is false.
To account for this intuition, the following weaker interpretation rule, Grice,
can be stated, which talks about knowledge rather than facts (where Kφ
means that the speaker knows φ):16
(11) Grice(φ) = {w ∈ Kφ | ∀v ∈ Kφ, ∀ψ ∈ A(φ) : w ⊨ Kψ → v ⊨ Kψ}
15 Landman (2000) and Chierchia (2004) discuss structurally similar examples like Mary is
either working at her paper or seeing some of her students.
16 A similar weaker interpretation is given by Sauerland (2004).
As shown by Spector (2003) and van Rooij & Schulz (2004), exhaustive inter-
pretation follows from this, if we assume that the speaker is as competent as
possible insofar as this is compatible with Grice.
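The rules Prag in (8) and Exh in (10) can be tried out on a small model of example (9). The following sketch is an illustration under stated assumptions: a world is identified with the set of people who passed, "Some of the students passed" is approximated by "at least one of the three passed", and the names `passed`, `prag` and `exh` are chosen for the example.

```python
from itertools import combinations

people = ("Alice", "Bob", "Cindy")
# A world is identified with the set of people who passed in it.
worlds = [frozenset(c) for n in range(len(people) + 1) for c in combinations(people, n)]


def passed(name):
    """The proposition that `name` passed, as a set of worlds."""
    return {w for w in worlds if name in w}


def prag(phi, alternatives):
    """Rule (8): drop every world that verifies a strictly stronger alternative."""
    return {w for w in phi if not any(w in psi and psi < phi for psi in alternatives)}


def exh(phi, alternatives):
    """Rule (10): keep the worlds of phi that verify a minimal set of alternatives."""
    def true_alts(w):
        return {i for i, psi in enumerate(alternatives) if w in psi}
    return {w for w in phi if not any(true_alts(v) < true_alts(w) for v in phi)}


disjunction = passed("Alice") | passed("Bob") | passed("Cindy")   # sentence (9)
atomic = [passed(p) for p in people]
all_passed = {frozenset(people)}

exactly_one = {w for w in worlds if len(w) == 1}
some_not_all = {w for w in worlds if 1 <= len(w) <= 2}
```

Here `exh(disjunction, atomic)` equals `exactly_one`, the desired reading of (9), and `prag(disjunction, [all_passed])` equals `some_not_all`, the usual scalar implicature. Applying `prag` with the atomic alternatives instead yields the empty set, since every world of the disjunction verifies a stronger alternative.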
3.2 The problem
Although these interpretation rules account for many conversational implicatures,
they give rise to the wrong predictions for more complex statements
involving disjunction. Two prime examples are (i) free choice permissions
of the form ♦(φ ∨ ψ), and (ii) conditionals with disjunctive antecedents like
(φ ∨ ψ) > χ. It is widely held that the alternatives of these sentences are
respectively ♦φ, ♦ψ and ♦(φ ∧ ψ), and φ > χ, ψ > χ and (φ ∧ ψ) > χ.
Before we can discuss the possible pragmatic interpretations, let us first
note that according to standard deontic logic ♦(φ ∨ ψ) ⊨ ♦φ ∨ ♦ψ,17 and
that adopting the Lewis/Stalnaker analysis of conditionals, it holds that
(φ ∨ ψ) > χ ⊨ (φ > χ) ∨ (ψ > χ).
Let us first look at the standard pragmatic interpretation rule Prag. Given
that ♦φ and ♦ψ express stronger propositions than ♦(φ ∨ ψ), it immediately
follows that Prag(♦(φ ∨ ψ)) = ∅, which is obviously wrong. Let’s turn
then to exhaustive interpretation. We take the only relevantly different
worlds in which ♦(φ ∨ ψ) is true to be {u, v, w}, where ♦φ is true in u
and w, and ♦ψ is true in v and w. Recall that Exh(φ) holds in worlds in
which as few alternatives to φ as possible are true. But this means that
Exh(♦(φ ∨ ψ)) = {u, v}, from which we can wrongly conclude that only one
of the permissions is true. The desired conclusion that both permissions are
true is incompatible with this pragmatic interpretation. A similar story holds
for conditionals with disjunctive antecedents.
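The two wrong predictions can be reproduced mechanically on the three-world model {u, v, w} described above. The sketch below is a minimal illustration; it restricts the alternatives to ♦φ and ♦ψ (leaving out ♦(φ ∧ ψ), whose extension the text does not fix), and the variable and function names are assumptions of the example.

```python
def prag(phi, alternatives):
    """Rule (8): drop every world that verifies a strictly stronger alternative."""
    return {w for w in phi if not any(w in psi and psi < phi for psi in alternatives)}


def exh(phi, alternatives):
    """Rule (10): keep the worlds of phi that verify a minimal set of alternatives."""
    def true_alts(w):
        return {i for i, psi in enumerate(alternatives) if w in psi}
    return {w for w in phi if not any(true_alts(v) < true_alts(w) for v in phi)}


may_phi = {"u", "w"}          # worlds in which "may phi" is true
may_psi = {"v", "w"}          # worlds in which "may psi" is true
may_or = {"u", "v", "w"}      # worlds in which "may (phi or psi)" is true
alternatives = [may_phi, may_psi]

prag_reading = prag(may_or, alternatives)   # the empty set
exh_reading = exh(may_or, alternatives)     # u and v, but not w
free_choice = may_phi & may_psi             # the intuitively desired reading: w only
```

Neither result contains the free choice world w: Prag leaves no world at all, and Exh keeps exactly the two worlds in which only one permission holds.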
Let us turn now to the weaker Gricean interpretation Grice. This weaker
Gricean rule indeed predicts an interpretation that the sentences in fact have.
For the disjunctive permission ♦(φ ∨ ψ) it is predicted that neither ♦φ nor
♦ψ are known to be true, but that they both are possibly true, perhaps even
together. This prediction is appealing, but strengthening this by assuming
our earlier form of competence doesn’t give rise to the desired conclusion:
the resulting exhaustive interpretation gives rise to the wrong prediction.
Perhaps this just means that the set of alternatives is chosen wrongly, or
17 Similarly, f_w(τ_w(α ∨ β)) ⊆ f_w(τ_w(α)) ∪ f_w(τ_w(β)) and Π*_{φ∨ψ} ⊆ Π*_φ ∪ Π*_ψ.
that the competence assumption is formalized in the wrong way. Indeed, this
was proposed by Schulz (2003, 2005) to account for free choice permissions.
As for the latter, she took a speaker to be competent in case she knows of
each alternative whether it is true. Second, she took the set of alternatives
of ♦φ to be the set {□ψ : ψ ∈ A(φ)} ∪ {□¬ψ : ψ ∈ A(φ)}.18 First, notice
that by applying Grice to a sentence of the form ♦(φ ∨ ψ) it immediately
follows that the speaker knows neither □¬φ nor □¬ψ, in formulas, ¬K□¬φ
and ¬K□¬ψ. What we would like is that from here we derive the free choice
reading: ♦φ and ♦ψ, which would follow from K¬□¬φ and K¬□¬ψ. Of
course, this doesn’t follow yet, because it might be that the speaker does not
know what the agent may or must do.19 But now assume that the speaker is
competent on this in Schulz’ sense. Intuitively, this means that P□φ ≡ K□φ
and P♦φ ≡ K♦φ, where P is the dual of K (Pχ holds iff ¬K¬χ does).
Remember that after applying Grice, it is predicted that
neither K□¬φ nor K□¬ψ holds, which means that P¬□¬φ and P¬□¬ψ have
to be true. The latter, in turn, are equivalent to P♦φ and P♦ψ. By competence
we can now immediately conclude K♦φ and K♦ψ, from which we can
derive ♦φ and ♦ψ as desired, because knowledge implies truth.20
Although I find this analysis appealing, it is controversial, mainly because
of her choice of alternatives. This also holds for other proposed pragmatic
analyses to account for free choice permissions, such as, for example, that of
Kratzer & Shimoyama (2002). In section 4 I will discuss some other possible
analyses that explain the desired free choice inference that assume that the
alternatives of (φ ∨ ψ) > χ and ♦(φ ∨ ψ) are φ > χ and ψ > χ, and ♦φ and
♦ψ, respectively.
18 Taking □φ as an alternative is natural if one wants to infer from ♦φ the falsity of this necessity
statement.
19 Notice, though, that this inference does follow if ‘□’ and ‘♦’ stand for epistemic must and
epistemic might. This is so, because for the epistemic case we can safely assume that the
speaker knows what he believes, which can be modeled by taking the epistemic accessibility
relation to be fully introspective. This gives the correct predictions, because from Katrin
might be at home or at work, it intuitively follows that, according to the speaker, Katrin
might be at home, and that she might be at work (cf., Zimmermann 2000).
20 Notice that it is also Schulz’ reasoning and notion of competence for Anna ate all of the
cookies that is used to explain why from (2-a) we conclude to (2-b).
4 Taking both directions into account
4.1 The intuition21
Suppose we adopt a Stalnaker/Lewis style analysis of conditional sentences.
In that case we have to assume a selection function, f, to evaluate the truth
value of the sentence. Take now the set of worlds in which (φ ∨ ψ) > χ
is true to be {u, v, w}, such that (i) f_u(φ ∨ ψ) = f_u(φ) ⊆ χ and f_u(ψ) ⊈ χ,
(ii) f_v(φ) ⊈ χ and f_v(φ ∨ ψ) = f_v(ψ) ⊆ χ, and (iii) f_w(φ ∨ ψ) =
f_w(φ) ∪ f_w(ψ) ⊆ χ.22 We would like to conclude via pragmatic reasoning
that the speaker who asserted (φ ∨ ψ) > χ implicated that we are in world
w. In that case both φ > χ and ψ > χ are true as well, and we derived the
“conjunctive” interpretation of the conditional with a disjunctive antecedent.
The reasoning will go as follows. First, we are going to assume that the
speaker is competent: she knows in which world she is. It seems unreasonable
that she is in u, because otherwise the speaker could have used an alternative
expression, φ > χ, which (limiting ourselves to worlds in which (φ ∨ ψ) > χ
is true) more accurately singles out {u} than (φ ∨ ψ) > χ does. For the
same reason we can conclude that the speaker is not in world v. In the only
other case, w, f_w(φ ∨ ψ) = f_w(φ) ∪ f_w(ψ) ⊆ χ, and thus both φ > χ
and ψ > χ are true. Of course, one might wonder whether this state, too,
could be expressed more economically by an alternative expression. But
the answer to this will be negative, because we have already assumed that
(φ > χ) ∧ (ψ > χ) is not an alternative to (φ ∨ ψ) > χ. Thus, (φ ∨ ψ) > χ
21 The intuition of the following solution I owe to Franke (2009). One way of working out this
intuition will be somewhat different, though, from what Franke proposed. This way makes
use of bidirectional optimality theory. Earlier accounts making use of Bi-OT include Sæbø
2004 and Aloni 2007. What I always found problematic about such earlier Bi-OT solutions
(I was a co-author of an earlier version of Aloni 2007) is that complexity of alternative
expressions was taken to play a crucial role. But explanations based on complexity are not
always equally convincing. Following Franke 2009, I believe that making use of complexity is
not required. At a 2009 conference in Leuven where I presented Bi-OT and game-theoretic
“solutions” of the problem of free choice inferences, Bart Geurts presented a solution that
was based on a similar intuition (I am not sure in how far complexity played a crucial role
here, or not), to be presented in Geurts 2010. I believe that also Edgar Onea suggested a
solution very much in the same spirit. Perhaps this should be taken as an indication how
natural a solution in this spirit is.
22 It might seem that I wrongly assume that φ > χ ⊨ (φ ∨ ψ) > χ and ψ > χ ⊨ (φ ∨ ψ) > χ.
This is not, and should not, be the case. It might well be, for instance, that φ > χ is true in
w, but (φ ∨ ψ) > χ is not. However, our reasoning will not depend on such worlds, because
we will only consider worlds in which (φ ∨ ψ) > χ is true.
pragmatically entails φ > χ and ψ > χ, because if not, the speaker could
have used an alternative expression which more accurately singled out the
actual world.
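One simple way to mimic this informal reasoning is sketched below. It is not Franke's game-theoretic model, nor either of the implementations developed in this paper, only a toy reconstruction under two assumptions: the hearer reads each alternative expression exhaustively, and a competent speaker uses an alternative whenever it singles out her world. All names in the code are hypothetical.

```python
# Worlds in which (phi or psi) > chi is true, as in the text.
domain = {"u", "v", "w"}

# Extensions of the two alternatives, restricted to the domain.
truth = {
    "phi > chi": {"u", "w"},
    "psi > chi": {"v", "w"},
}


def exh(extension, alternatives):
    """Keep the worlds of `extension` that verify a minimal set of alternatives."""
    def true_alts(x):
        return {name for name, ext in alternatives.items() if x in ext}
    return {x for x in extension
            if not any(true_alts(y) < true_alts(x) for y in extension)}


# Hearer side: each alternative, read exhaustively, singles out one world.
singled_out = {name: exh(ext, truth) for name, ext in truth.items()}

# Speaker side: in a world singled out by an alternative, a competent speaker
# would have used that alternative, so the disjunctive conditional is left
# with the remaining worlds.
claimed = set().union(*singled_out.values())
disjunctive_reading = domain - claimed
```

On this toy reconstruction the disjunctive conditional ends up associated with w alone, the world in which both φ > χ and ψ > χ are true.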
Intuitive solutions are ok, but to test them, we have to make them precise.
In the following I will suggest two ways to implement the above intuition.
Both implementations are based on the idea that to account for the desired
“conjunctive” inferences of the disjunctive sentences, alternative expres-
sions and alternative worlds/interpretations must play a very similar role
in pragmatic interpretation. Thinking of it in somewhat different terms,
we should take seriously both the speaker’s and the hearer’s perspective.
Fortunately, there are two well-known theories on the market that look at
pragmatic interpretation from such a point of view: Bi-directional Optimality
Theory (e.g., Blutner 2000), and Game Theory (e.g., Benz, Jäger & van Rooij
2005). In the following I will discuss two possible ways to proceed, but they
have something crucial in common: both ways make use of different levels
of interpretation. The first proposal is game-theoretic in nature, and due
to Franke (2010, 2009). The second suggestion is a less radical departure
from the “received view” in pragmatics, and is more in the spirit of Bi-OT.
It makes crucial use of exhaustive interpretation and of different levels of
interpretation, but like in Bi-OT, alternative worlds and expressions that
initially played a role in interpretation need not play a role anymore at higher
levels.23
4.2 Franke’s game-theoretic solution
Game-theoretic and optimality-theoretic analyses of conversational implicatures
seek to account in one systematic way for both scalar implicatures and
for implicatures involving marked and unmarked meanings/interpretations,
inspired by Horn’s division of pragmatic labor. In order to do so, they
associate with an expression not just a semantic meaning, but also assign
probabilities. According to the most straightforward proposal, the probability
of w given φ is P(w | φ) = 1/card(φ) if w ∈ φ, and 0 otherwise. Recall
that according to one standard approach pragmatic interpretation works as
follows:
(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.

23 For the exact relation between Bi-OT and the game-theoretical best-response dynamics
Franke makes use of, see Franke 2009.
P(w | φ)      u      v      w      Σ_w P(w | φ)
Some          1/3    1/3    1/3    = 1
Most          0      1/2    1/2    = 1
All           0      0      1      = 1
Figure 1   P(w | φ) for standard scale

P(w | f)      uφ≺ψ   vψ≺φ   wφ≈ψ   Σ_w P(w | f)
φ > χ         1/2    0      1/2    = 1
ψ > χ         0      1/2    1/2    = 1
(φ ∨ ψ) > χ   1/3    1/3    1/3    = 1
Figure 2   P(w | f) for counterfactual
On the assumption that all worlds are equally likely, here is a straightforward
way to reformulate (8) making use of probabilities:
(12) Prag′(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : P(w | ψ) > P(w | φ)}
Look now at a standard example with All = {w} ⊂ Most = {v, w} ⊂ Some = {u, v, w}.24
From the assertion Some the desired implicature
immediately follows, as can be seen from figure 1.
The idea is that, for instance, Most is pragmatically interpreted as {v},
because (i) there is no world in which Most gets a higher value, and (ii) in v it
is best to utter Most, because for all alternatives ψ, P (v | ψ) < P (v | Most).
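The probabilistic rule (12) can be checked mechanically on the standard scale. What follows is only a minimal Python sketch, not part of the original analysis: the names WORLDS, SCALE, prob and prag_prime are illustrative assumptions, and the alternatives of a form are assumed to be the other two members of the scale.

```python
from fractions import Fraction

# Worlds and denotations of the standard scale, as in figure 1.
WORLDS = ["u", "v", "w"]
SCALE = {"Some": {"u", "v", "w"}, "Most": {"v", "w"}, "All": {"w"}}

def prob(world, form):
    """P(w | form): uniform over the worlds in which the form is true."""
    extension = SCALE[form]
    return Fraction(1, len(extension)) if world in extension else Fraction(0)

def prag_prime(form):
    """Rule (12): keep w unless some alternative gives w a higher probability."""
    alternatives = [f for f in SCALE if f != form]  # assumed alternative set
    return {
        w for w in SCALE[form]
        if not any(prob(w, alt) > prob(w, form) for alt in alternatives)
    }

for form in SCALE:
    print(form, sorted(prag_prime(form)))
```

Under these assumptions the sketch selects {u} for Some, {v} for Most and {w} for All, in line with the reading of figure 1 described above.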
Let us now do the same for the sentence (φ ∨ ψ) > χ, together with its
alternatives. Suppose that the columns list the alternative worlds
(with uφ≺ψ standing for the world where the best φ-worlds are closer to u
than any ψ-world), and that χ is true in the most similar
worlds (but not in others). In that case we get figure 2 (where f is an arbitrary
form, or expression).
24 With All abbreviating All Ps are Qs.
A number of things are worth remarking. First of all, all sentences are
true in wφ≈ψ . As a result of this, a (naive) hearer will interpret, for instance,
φ > χ as equally likely true in uφ≺ψ as in wφ≈ψ . Now take the speaker’s
perspective. Which statement would, or should, she make given that she is in
a particular situation, or world? Naturally, that statement that gives her the
highest chance that the (naive) hearer will interpret the message correctly.
Thus, she should utter that sentence which gives the highest number in the
column. But this means that in uφ≺ψ she should (and rationally would) utter
φ > χ, in vψ≺φ she should utter ψ > χ, and in wφ≈ψ it doesn’t matter what
she utters, both are equally good. The boxed entries model this speaker’s
choice. The important thing to note is that according to this reasoning, no
speaker (a speaker in no world) would ever utter (φ ∨ ψ) > χ. Still, this
is exactly the message that was uttered and should be interpreted, so we
obviously missed something.
Franke (2009) proposes that our reasoning didn’t go far enough. We
should now take the hearer’s perspective again, taking into account the
optimal speaker’s message choice given a naive semantic interpretation of the
hearer.25 This can best be represented by modeling the probabilities of the
messages sent according to the previous reasoning, given the situation/world
that the speaker is in.26 How should the hearer now interpret the messages?
Well, because the speaker would always send φ > χ in uφ≺ψ , while the chance
that she sends φ > χ in wφ≈ψ is lower (and taking the a priori probabilities
of the worlds to be equal), there is a higher chance that the speaker of φ > χ
is in world uφ≺ψ than in wφ≈ψ , and thus the hearer will choose accordingly.
This is represented by the boxed entry in figure 3 (in which P (f | w) stands
for the probability with which the speaker would say f if she were in w).
Something similar holds for ψ > χ. As for (φ ∨ ψ) > χ, it is clear that all
worlds are equally likely now, given that a previous speaker would not make
this utterance in any of those worlds.
Having specified how such a more sophisticated hearer would interpret
the alternative utterances, we turn back to the speaker, but now assume that
the speaker takes such a more sophisticated hearer into account. First we fill
in the probabilities of the worlds, given the previous reasoning. Notice that
these probabilities are crucially different from the earlier P (w | f ). The
speaker now chooses optimally given these probabilities: i.e., the speaker
25 Jäger & Ebert (2009) make a similar move. Both models are instances of Iterated Best
Response (IBR) models.
26 For a more precise description, the reader should consult Franke 2009, obviously.
P(f | w)       uφ≺ψ   vψ≺φ   wφ≈ψ
φ > χ          1      0      1/2
ψ > χ          0      1      1/2
(φ ∨ ψ) > χ    0      0      0
Σ_f P(f | w)   1      1      1
Figure 3   P(f | w) for 1st-level hearer

P(w | f)       uφ≺ψ   vψ≺φ   wφ≈ψ   Σ_w P(w | f)
φ > χ          1      0      0      = 1
ψ > χ          0      1      0      = 1
(φ ∨ ψ) > χ    1/3    1/3    1/3    = 1
Figure 4   P(w | f) for 1st-level speaker
chooses (one of) the highest rows in the columns. In uφ≺ψ and vψ≺φ she
would choose as before, but in wφ≈ψ she now chooses (φ ∨ ψ) > χ instead
of either of the others. This is again represented by boxed entries in figure 4.
If we take the hearer’s perspective again, the iteration finally reaches a
fixed point. As illustrated by figure 5, (φ ∨ ψ) > χ is now interpreted by
the even more sophisticated hearer in the desired way. From the truth of
(φ ∨ ψ) > χ, both φ > χ and ψ > χ pragmatically follow.
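The back-and-forth reasoning of this subsection can be traced step by step in code. The following Python sketch is a simplified reconstruction under stated assumptions, not Franke's own formulation: worlds are a priori equally likely, a hearer picks the world(s) in which a message is most likely to be sent, and a message that no speaker would send is interpreted uniformly over its literal meaning. All function and variable names are hypothetical.

```python
from fractions import Fraction

WORLDS = ["u", "v", "w"]          # u: phi closer, v: psi closer, w: equally close
MEANING = {                        # literal truth conditions, as in figure 2
    "phi>chi": {"u", "w"},
    "psi>chi": {"v", "w"},
    "(phi|psi)>chi": {"u", "v", "w"},
}

def uniform(support, domain):
    """Uniform distribution over `support`, zero elsewhere in `domain`."""
    return {x: Fraction(1, len(support)) if x in support else Fraction(0)
            for x in domain}

def naive_hearer():
    """Level 0: P(w | f) is uniform over the worlds where f is true."""
    return {f: uniform(ws, WORLDS) for f, ws in MEANING.items()}

def speaker_reply(hearer):
    """In each world, send (one of) the messages with the highest P(w | f)."""
    strategy = {}
    for w in WORLDS:
        best = max(hearer[f][w] for f in MEANING)
        strategy[w] = uniform({f for f in MEANING if hearer[f][w] == best}, MEANING)
    return strategy

def hearer_reply(speaker):
    """For each message, pick the world(s) where it is most likely to be sent."""
    beliefs = {}
    for f, ws in MEANING.items():
        best = max(speaker[w][f] for w in WORLDS)
        if best == 0:              # unsent message: fall back on literal meaning
            beliefs[f] = uniform(ws, WORLDS)
        else:
            beliefs[f] = uniform({w for w in WORLDS if speaker[w][f] == best}, WORLDS)
    return beliefs

def iterate(levels=3):
    """Return [H0, S1, H1, S2, H2, ...] for the given number of rounds."""
    hearer = naive_hearer()
    history = [hearer]
    for _ in range(levels):
        speaker = speaker_reply(hearer)
        hearer = hearer_reply(speaker)
        history += [speaker, hearer]
    return history

history = iterate()
print(history[4]["(phi|psi)>chi"])   # second-level hearer's reading of the disjunction
```

With these assumptions, `history[1]`, `history[2]` and `history[3]` correspond to figures 3, 4 and 5 respectively, and the hearer's interpretation no longer changes after the second round.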
Franke (2010, 2009) shows that by exactly the same reasoning free choice
permissions are accounted for as well.27 What is more, using exactly the same
machinery he can even explain (by making use of global reasoning) why we
infer from (φ ∨ ψ) > χ and ♦(φ ∨ ψ) that the alternatives (φ ∧ ψ) > χ and
♦(φ ∧ ψ) are not true, inferences that are sometimes taken to point to a local
analysis of implicature calculation.
27 Franke uses standard deontic logic, but that doesn’t seem essential. Starting with one of the
two more dynamic approaches, he could explain the free choice inference as well using a
very similar reasoning.
P(f | w)       uφ≺ψ   vψ≺φ   wφ≈ψ
φ > χ          1      0      0
ψ > χ          0      1      0
(φ ∨ ψ) > χ    0      0      1
Σ_f P(f | w)   1      1      1
Figure 5   P(f | w) for 2nd-level hearer
4.3 A Bidirectional-like solution
Is there any relation between the above game-theoretic reasoning and the
“received” analysis making use of pragmatic interpretation rule (8) or that of
exhaustive interpretation, (10)? I will suggest that a “bidirectional” received
view is at least very similar to Franke’s proposal sketched above, and does
the desired work as well.
In the above explanation, we started with looking at the semantic inter-
pretation from the hearer’s point of view. This way of starting things was
motivated by pragmatic interpretation rule (8):
(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.
But we could have started with the pragmatic interpretation rule (10) as well.
(10) Exh(φ) = {w ∈ φ | ¬∃v ∈ φ : v <A(φ) w}.
In that case we wouldn’t have started from the hearer’s, but rather from the
speaker’s point of view. Also this would have given rise to a reformulation
and a table, but now the probability function, P (ψ | w), gives the probabilities
with which the speaker would have used the alternative expression ψ given
the world w she is in. The naive assumption now is that P(ψ | w) is simply
1/card({χ ∈ A(φ) : w ⊨ χ}), if w ⊨ ψ, and 0 otherwise. The reformulation now looks as
follows:
(13) Exh′(φ) = {w ∈ φ | ¬∃v : P(φ | v) > P(φ | w)}.
For the simple scalar implicature, the table to start with from a naive speaker’s
point of view is given in figure 6:

P(φ | w)       u      v      w
Some           1      1/2    1/3
Most           0      1/2    1/3
All            0      0      1/3
Σ_f P(f | w)   1      1      1
Figure 6   P(φ | w) for standard scale

P(f | w)       u      v      w
♦φ             1/2    0      1/3
♦ψ             0      1/2    1/3
♦(φ ∨ ψ)       1/2    1/2    1/3
Σ_f P(f | w)   1      1      1
Figure 7   P(f | w) for 0th-level speaker

Though the way of choosing would be different (it is the hearer now who
chooses the column with the highest number), the result would be exactly
the same. What would be the beginning table for our problematic sentence
♦(φ ∨ ψ)? It is given in figure 7.
Just as we derived using the rule of exhaustive interpretation, the first
prediction would be that ♦(φ ∨ ψ) is interpreted as {u, v}. To improve
things, we have to look again at the hearer’s perspective. And, in fact, this
could be done in Franke’s framework, and we end up with exactly the same
desired solution. What this suggests is two things: (i) adopting speaker’s and
hearer’s point of view closely corresponds with pragmatic interpretation rules
(10) and (8), respectively; (ii) to correctly predict the pragmatic interpretation
of ♦(φ ∨ ψ) we have to take both types of interpretation rules into account.
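The naive speaker tables and rule (13) can be reproduced with a few lines of code. This is a rough Python sketch under the assumption, read off figures 6 and 7, that the speaker chooses uniformly among all forms under consideration (the asserted sentence included) that are true in her world; the names naive_speaker and exh_prime and the ASCII labels for the forms are illustrative only.

```python
from fractions import Fraction

WORLDS = ["u", "v", "w"]
SCALE = {"Some": {"u", "v", "w"}, "Most": {"v", "w"}, "All": {"w"}}
PERMISSION = {                       # denotations assumed as in figure 7
    "may(phi)": {"u", "w"},
    "may(psi)": {"v", "w"},
    "may(phi or psi)": {"u", "v", "w"},
}

def naive_speaker(forms):
    """P(f | w): uniform over the forms that are true in world w."""
    table = {}
    for w in WORLDS:
        true_here = [f for f, ext in forms.items() if w in ext]
        table[w] = {f: Fraction(1, len(true_here)) if f in true_here else Fraction(0)
                    for f in forms}
    return table

def exh_prime(form, forms):
    """Rule (13): worlds of the form where the speaker is most likely to use it."""
    table = naive_speaker(forms)
    best = max(table[w][form] for w in WORLDS)
    return {w for w in forms[form] if table[w][form] == best}

print(sorted(exh_prime("Some", SCALE)))
print(sorted(exh_prime("may(phi or psi)", PERMISSION)))
```

On these inputs the sketch yields {u} for Some and {u, v} for the disjunctive permission, which is the first, still undesired, prediction discussed in the text.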
Recall the intuition as expressed in the previous subsection. That reasoning
corresponded very closely to the following pragmatic interpretation
rule: Prag∗(φ) = {w ∈ Exh(φ) | ¬∃ψ ∈ A(φ) : w ∈ ψ_φ ∧ ψ_φ ⊂ φ},
where ψ_φ denotes ψ ∩ φ and ψ is taken not to be an element of
A(φ).28 Notice that this rule is close to interpretation rule (8), with the important
difference that exhaustive interpretation (the speaker’s point of view)
plays an important role. Unfortunately, just as the earlier Prag, also this rule
wrongly predicts that a sentence like ♦(φ ∨ ψ) doesn’t have a pragmatic
interpretation (Prag∗(♦(φ ∨ ψ)) = ∅). For this reason we have to iterate,29
although the intuition behind this new rule will remain the same: ♦(φ ∨ ψ)
pragmatically entails ♦φ and ♦ψ, because if not, the speaker could have
used an alternative expression which more accurately singled out the actual
world.
In the following we will abbreviate the condition that ∀ψ ∈ A(φ) : v ∈
ψ ⇒ w ∈ ψ by v ≤_{A(φ)} w as before. If K is the set of worlds in
which the sentence under consideration is true, I will also abbreviate ψ ∩
K by ψ_K. Moreover, ψ ≺_n φ will be an abbreviation for the condition
ψ_{K_n} ⊂ φ_{K_n}, if n = 0, and ψ_{K_n} ⊆ φ_{K_n}, otherwise. Intuitively, ψ ≺_n φ
expresses the fact that at least some worlds of the nth-level interpretation
of φ could be expressed more precisely by alternative expression ψ. We will
make use of the following definitions:30
(14) Exh_n^{K_n}(φ) =def {w ∈ φ_{K_n} | ¬∃v ∈ φ_{K_n} : v <_{A_n(φ)} w}.
(15) Prag_n^{K_n}(φ) =def {w ∈ Exh_n^{K_n}(φ) | ¬∃ψ ∈ A_n(φ), w ∈ ψ_{K_n} & ψ ≺_n φ}.
(16) K_{n+1} =def {w ∈ φ_{K_n} | w ∉ Exh_n^{K_n}(φ)}.
(17) A_{n+1}(φ) =def {ψ ∈ A_n(φ) | ¬∃w ∈ Exh_n^{K_n}(φ), w ∈ ψ_{K_n} & ψ ≺_n φ}.
The pragmatic interpretation of φ with respect to set of worlds K and
alternative expressions A(φ), Prag^K(φ), will now be Prag_n^{K_n}(φ) for the first
n such that Prag_n^{K_n}(φ) ≠ ∅. If there is no such n, Prag^K(φ) = Exh_0^{K_0}(φ),
where K_0 = K and A_0(φ) = A(φ).
28 Notice that if w ∈ Prag∗(φ), one can think of the pair ⟨φ, w⟩ as, using bidirectional
OT-terminology, a strong optimal form-meaning pair.
29 In OT-terminology, we have to look at the notion of weak optimality.
30 I won’t try to prove this here, but I believe that the analysis would be almost equivalent to
Franke’s game-theoretic approach, if we redefined the orderings ‘v <_{A_n(φ)} w’
and ‘ψ ≺_n φ’ in quantitative rather than qualitative terms as follows: v ≤_{A_n(φ)} w
iff_def card({ψ ∈ A(φ) : v ∈ ψ}) ≤ card({ψ ∈ A(φ) : w ∈ ψ}), and ψ ≺_n φ iff_def
card(ψ_{K_0}) < card(φ_{K_n}), if n = 0, and card(ψ_{K_n}) ≤ card(φ_{K_n}), otherwise.
Notice that (14) and (15) are just the straightforward generalizations with
respect to a set of worlds K of standard exhaustive interpretation rule (10)
and pragmatic interpretation rule (8) respectively.
(10) Exh^K(φ) = {w ∈ φ_K | ¬∃v ∈ φ_K : v <_{A(φ)} w}.
(8) Prag^K(φ) = {w ∈ φ_K | ¬∃ψ ∈ A(φ), w ∈ ψ_K & ψ_K ⊂ φ_K}
The only difference between (14) and (10) is that the relevant set of worlds
and the relevant set of alternatives might depend on earlier stages in the
interpretation. If we limit ourselves to the first interpretation (i.e., level 0),
the two interpretation rules are identical. Similarly for the difference between
(15) and (8): the relevant alternatives depend on earlier stages, and the set of
worlds with respect to which the entailment relation between ψ and φ must
be determined depends on earlier stages as well. Indeed, if we look at the
first interpretation, the only important difference is that (15) takes as input
the exhaustive interpretation of φ, while this is not the case for (8). This
difference implements the view that speaker’s and hearer’s perspective are
both required.
The definitions (16) and (17) determine which worlds and alternative
expressions are relevant for the interpretation at the (n + 1)th level of
interpretation. We start with interpretation 0 (the first interpretation). Notice first
that level 1 is only reached in case Prag_0^{K_0}(φ) = ∅, i.e., in case for each world
v in the exhaustive interpretation of φ there is an alternative expression ψ
that is true in v and which is stronger than φ. Thus, in that case there is no
world v ∈ Exh(φ) such that φ is at least as specific as any other alternative
that is true in v. For the interpretation of φ at level 1 we will not consider
worlds in the 0th-level exhaustive interpretation of φ anymore. This is what
(16) implements. The new set of alternatives determined by (17) consists of those
elements of the original set of alternatives A_0 that did not help to eliminate
worlds in Exh(φ) at the 0th level of interpretation.
Let us see how things work out for some particular examples. Let us
first look at ♦(φ ∨ ψ) with A(♦(φ ∨ ψ)) = {♦φ, ♦ψ, ♦(φ ∧ ψ)}, and assume
that K = {u, v, w, x}, ♦(φ ∨ ψ) = {u, v, w, x}, ♦φ = {u, w, x}, ♦ψ =
{v, w, x}, and ♦(φ ∧ ψ) = {x}. Observe that Exh_0^{K_0}(♦(φ ∨ ψ)) = {u, v}. But
neither u nor v can be an element of Prag_0^{K_0}(♦(φ ∨ ψ)), because (♦φ)_{K_0} ⊂
♦(φ ∨ ψ) and (♦ψ)_{K_0} ⊂ ♦(φ ∨ ψ). It follows that Prag_0^{K_0}(♦(φ ∨ ψ)) = ∅.
We continue, and calculate K_1 and A_1(♦(φ ∨ ψ)). The new set of worlds
we have to consider, K_1, is just K − Exh_0^{K_0}(♦(φ ∨ ψ)) = {w, x}. The new
set of alternatives, A_1(♦(φ ∨ ψ)), is just {♦(φ ∧ ψ)}. Now, we have to
determine Exh_1^{K_1}(♦(φ ∨ ψ)) and Prag_1^{K_1}(♦(φ ∨ ψ)). Because K_1 = {w, x}
and ♦(φ ∧ ψ) is only true in x, both will be {w}. But this means that also
Prag^K(♦(φ ∨ ψ)) = {w}, and thus that we can pragmatically infer both
♦φ and ♦ψ from the assertion that ♦(φ ∨ ψ), as desired. A very similar
calculation shows that we can pragmatically infer both φ > χ and ψ > χ
from the assertion that (φ ∨ ψ) > χ. What’s more, we have even explained
why we can pragmatically infer from ♦(φ ∨ ψ) that the alternative ♦(φ ∧ ψ)
is not true, just as Franke (2009) could.
These predictions are exactly as desired, but how does our machinery
work for simpler examples, like φ ∨ ψ? Fortunately, it predicts correctly
here as well. First, assume that φ ∨ ψ = {u, v, w} = K, φ = {u, w}, ψ =
{v, w}, and φ ∧ ψ = {w}. Observe that Exh_0^{K_0}(φ ∨ ψ) = {u, v}. On the
basis of these facts, we can conclude that Prag_0^{K_0}(φ ∨ ψ) = ∅. This is just the
same reasoning as before. The difference shows up when we go to the next
level and determine Prag_1^{K_1}(φ ∨ ψ), because now there will be an alternative
left over which plays a crucial role. But first calculate K_1 and A_1(φ ∨ ψ):
K_1 = {w} and A_1(φ ∨ ψ) = {φ ∧ ψ}. Obviously, Exh_1^{K_1}(φ ∨ ψ) = {w},
but because w ∈ φ ∧ ψ, it follows that (φ ∧ ψ) ≺_1 (φ ∨ ψ), and thus
Prag_1^{K_1}(φ ∨ ψ) = ∅. It follows that K_2 = ∅, from which we can conclude that
Prag^K(φ ∨ ψ) = Exh_0^{K_0}(φ ∨ ψ) = {u, v}, as desired.
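The two calculations above can be replayed mechanically. The following Python sketch is one possible reading of definitions (14) to (17), offered as an illustration rather than a definitive implementation: propositions are modelled as sets of world labels, the iteration is assumed to stop once no worlds are left (in which case the level-0 exhaustive interpretation is returned), and the names leq, exh and prag are hypothetical.

```python
def leq(v, w, alts):
    """v ≤_A w: every alternative that is true in v is also true in w."""
    return all(w in ext for ext in alts.values() if v in ext)

def exh(phi, alts):
    """Rule (14): worlds of phi that are minimal with respect to <_A."""
    return {w for w in phi
            if not any(leq(v, w, alts) and not leq(w, v, alts) for v in phi)}

def prag(phi, alts, K):
    """Iterate (14)-(17); return the first non-empty Prag_n, else Exh_0."""
    K_n, A_n, n = set(K), dict(alts), 0
    exh_0 = exh(phi & K_n, A_n)
    while phi & K_n:
        phi_n = phi & K_n
        exh_n = exh(phi_n, A_n)

        def more_precise(ext):
            # psi ≺_n phi: proper subset at level 0, subset at later levels
            psi_n = ext & K_n
            return psi_n < phi_n if n == 0 else psi_n <= phi_n

        # alternatives that eliminate some world of Exh_n (dropped by (17))
        blocking = {name for name, ext in A_n.items()
                    if more_precise(ext) and ext & exh_n}
        prag_n = {w for w in exh_n
                  if not any(w in A_n[name] for name in blocking)}   # rule (15)
        if prag_n:
            return prag_n
        K_n = phi_n - exh_n                                           # rule (16)
        A_n = {name: ext for name, ext in A_n.items() if name not in blocking}
        n += 1
    return exh_0

# Free choice permission: K = {u, v, w, x}
permission = prag({"u", "v", "w", "x"},
                  {"may phi": {"u", "w", "x"}, "may psi": {"v", "w", "x"},
                   "may (phi and psi)": {"x"}},
                  {"u", "v", "w", "x"})
# Plain disjunction: K = {u, v, w}
disjunction = prag({"u", "v", "w"},
                   {"phi": {"u", "w"}, "psi": {"v", "w"}, "phi and psi": {"w"}},
                   {"u", "v", "w"})
print(sorted(permission), sorted(disjunction))
```

On the inputs given in the text, this sketch returns {w} for ♦(φ ∨ ψ) and {u, v} for φ ∨ ψ, matching the hand calculations.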
Let us now see what happens if we look at multiple occurrences of disjunctions:
examples like φ ∨ ψ ∨ χ, ♦(φ ∨ ψ ∨ χ), and (φ ∨ ψ ∨ χ) > ξ. First look
at φ ∨ ψ ∨ χ and assume that φ = {w1, w4, w5, w7}, ψ = {w2, w4, w6, w7},
and χ = {w3, w5, w6, w7}. Observe that Exh_0^{K_0}(φ ∨ ψ ∨ χ) = {w1, w2, w3}.
On the basis of these facts, we can conclude that Prag_0^{K_0}(φ ∨ ψ ∨ χ) = ∅. If
only the separate disjuncts were alternatives of φ ∨ ψ ∨ χ, it would result
that K_1 = {w4, w5, w6, w7}, which would then also be the inferred pragmatic
interpretation. We thus have to conclude that we need other alternatives as
well. It is only natural to assume that also φ ∧ ψ, φ ∧ χ, ψ ∧ χ, and φ ∧ ψ ∧ χ
are alternatives. In that case K_1 is still {w4, w5, w6, w7}, but now the new set
of alternatives is {φ ∧ ψ, φ ∧ χ, ψ ∧ χ, φ ∧ ψ ∧ χ}, and the resulting pragmatic
meaning will be different. In particular, Exh_1^{K_1}(φ ∨ ψ ∨ χ) = {w4, w5, w6}. However,
none of these worlds remains in Prag_1^{K_1}(φ ∨ ψ ∨ χ), because w4 ∈ φ ∧ ψ,
which is a stronger expression than φ ∨ ψ ∨ χ, and similarly for w5 and w6.
This means we have to go to the next level where K_2 = {w7}. But w7 won’t
be in Prag_2^{K_2}(φ ∨ ψ ∨ χ), because w7 ∈ φ ∧ ψ ∧ χ = {w7}. As a result,
Prag^K(φ ∨ ψ ∨ χ) = Exh_0^{K_0}(φ ∨ ψ ∨ χ) = {w1, w2, w3}, just as desired.
What about ♦(φ ∨ ψ ∨ χ), for instance? Once again we have to make a
closure assumption concerning the alternatives. As it turns out, the correct
way to go is also the most natural one: first, A(♦φ) = {♦ψ : ψ ∈ A(φ)},
and second, A(φ ∨ ψ ∨ χ) = {φ, ψ, χ, φ ∧ ψ, φ ∧ χ, ψ ∧ χ, φ ∧ ψ ∧ χ, φ ∨
ψ, φ ∨ χ, ψ ∨ χ}. Thus, at the “local” level, the alternatives are closed under
disjunction as well. Let us now assume that ♦φ = {w1, w4, w5, w7}, ♦ψ =
{w2, w4, w6, w7}, and ♦χ = {w3, w5, w6, w7}. Let’s assume for simplicity
that in none of these worlds any conjunctive permission like ♦(φ ∧ ψ) is
true. Observe that Exh_0^{K_0}(♦(φ ∨ ψ ∨ χ)) = {w1, w2, w3}. It follows that
K_1 = {w4, w5, w6, w7} and the new set of alternatives is the earlier set minus
{♦φ, ♦ψ, ♦χ}. The new exhaustive interpretation will be Exh_1^{K_1}(♦(φ ∨ ψ ∨
χ)) = {w4, w5, w6}, but all these worlds are ruled out for Prag_1^{K_1}(♦(φ ∨ ψ ∨ χ))
because of our disjunctive alternatives. This means that we have to go to
the next level. At level 2, the new set of worlds is just {w7}, which is thus
also Exh_2^{K_2}(♦(φ ∨ ψ ∨ χ)). World w7 cannot be eliminated by a more precise
alternative, which means that also Prag_2^{K_2}(♦(φ ∨ ψ ∨ χ)) = {w7}, which is
what Prag^K(♦(φ ∨ ψ ∨ χ)) will then denote as well. Notice that in w7 it holds
that all of ♦φ, ♦ψ, and ♦χ are true: the desired free choice inference. Similar
reasoning applies to (φ ∨ ψ ∨ χ) > ξ.
These calculations have made clear that to account for free choice per-
mission, we have to make use of exhaustive interpretation several times. In
this sense it is similar to the analysis proposed by Fox (2007). Still, there
are some important differences. One major difference is that Fox (2007)
exhaustifies not only the sentence that is asserted, but also the relevant
alternatives. Moreover, Fox uses exhaustification to turn alternatives into
other alternatives, thereby “syntacticising” the process. We don’t do anything
like this, and therefore feel that what we do is more in line with the Gricean
approach. Exhaustification always means looking at “minimal” worlds: we
don’t change the alternatives. The worst that can happen to them is that
they are declared not to be relevant anymore to determine the pragmatic
interpretation.
Notice that our analysis also immediately explains why it is appropriate to
use any under ♦, but not under □: whereas ♦(φ ∨ ψ ∨ χ) pragmatically entails
♦(φ ∨ χ), □(φ ∨ ψ ∨ χ) does not pragmatically entail □(φ ∨ χ). It is easy to see
that our analysis can account for the “free choice” inference of the existential
sentence as well: that from Several of my cousins had cherries or strawberries
we naturally infer that some of the cousins had cherries and some had
strawberries.31 In formulas, from ∃x(Px ∧ (Qx ∨ Rx)) we can pragmatically
infer that both ∃x(Px ∧ Qx) and ∃x(Px ∧ Rx) are true. But this shows that
yet another “paradoxical” conjunctive reading of disjunctive sentences can
be accounted for as well.32 If we analyze comparatives as proposed by Larson
(1988), for instance, it is predicted that John is taller than Mary or Sue should
be represented as something like ∃d[d(T )(j)∧(¬d(T )(m)∨¬d(T )(s))], with
d a measure function from (denotations of) adjectives to sets of individuals.
Pragmatically we can infer from this that John is taller than Mary and that
John is taller than Sue.
Chemla (2009) argued that sentences of the form ∀x♦(P x ∨ Qx) give
rise to inferences that are more problematic to account for by globalist
approaches towards conversational implicatures than by localist approaches.
He found that people inferred from Everybody is allowed to take Algebra
or Literature that everybody can choose which of the two they will take.
This suggests that in general we infer from ∀x♦(P x ∨ Qx) both ∀x♦P x
and ∀x♦Qx. In their commentary article, Geurts & Pouscoulous (2009b)
suggested that the observed “conjunctive” inference might very well depend
on the particular construction being used, however, and thus be less general
than predicted by a localist approach. Moreover, they suggest that universal
permission sentences are just summaries of permissions of the form ♦(φ ∨
ψ) made to multiple addressees, in which case the data can be explained
by any global analysis that can explain standard free choice permissions.
I don’t know what is the appropriate analysis of these inferences. I can
point out, however, what we would have to add to our analysis to account
for the conjunctive interpretation. If this conjunctive interpretation really
depends on the particular construction being used (as suggested by Geurts
and Pouscoulous), then it would be wise not to make use of this extra addition.
As it turns out, our approach predicts the conjunctive interpretation if
we include ∃x♦(Px ∨ Qx) among the alternatives, and we replace, in the
definition of ψ ≺_n φ, the notion ψ_{K_n} by the pragmatic interpretation of ψ,
Prag^K(ψ).33 The crucial step in this case is the one in which a minimal world
31 I believe that Nathan Klinedinst and Regine Eckhardt were the first to observe that these
inferences should go through. Perhaps it should be pointed out that Schulz (2003) could
straightforwardly account for these inferences as well.
32 This observation is due to Krasikova (2007), though she uses Fox’s analysis of free choice
inferences.
33 Thus, ψ ≺_n φ will be an abbreviation for the condition ψ_{K_n} ⊂ φ_{K_n}, if n = 0, and
Prag^K(ψ) ⊆ φ_{K_n}, otherwise, where Prag^K(ψ) is, as before, Prag_n^{K_n}(ψ) for the first n such
that Prag_n^{K_n}(ψ) ≠ ∅.
where ∃x♦P x and ∃x♦Qx are true but both ∀x♦P x and ∀x♦Qx false is
eliminated, because such a world could be more accurately expressed (given
the truth of ∀x♦(P x ∨ Qx)) by the alternative ∃x♦(P x ∨ Qx). While the
inclusion of ∃x♦(P x ∨Qx) among the alternatives of ∀x♦(P x ∨Qx) is not a
significant change to our framework, it has to be admitted that the exchange
of the notion ψKn by the pragmatic interpretation of ψ is significant. From
an intuitive point of view, the effect of this exchange would be that we do
not only look at the exhaustive interpretation of φ, the sentence asserted,
but also at the exhaustive interpretations of the alternatives. As a result, our
analysis would become much closer to the proposal of Fox (2007). But, as
mentioned above, if we were to adopt the suggestion of Geurts & Pouscoulous
(2009b), this would, in fact, not be the way to go.
5 Conclusion
The papers of Geurts & Pouscoulous (2009a) and Chemla (2009) provide
strong empirical evidence that sentences in which a trigger of a scalar
implicature occurs under a universal do not in general give rise to an embedded
implicature. This evidence favors a globalist analysis of conversational
implicatures over its localist alternative. As far as I know, it is uncontroversial that
triggers occurring under an existential do give rise to implicatures. In this
paper, and following Franke (2010, 2009), I discussed some ways in which
these challenging examples for a “globalist” analysis of conversational
implicatures could be given a principled global pragmatic explanation after all. I
suggested how potentially problematic examples for our global pragmatic
analysis of the form ∀x♦(P x ∨ Qx), as discussed by Chemla (2009), could
be treated as well. At least two things have to be admitted, though. First,
our global analysis still demands that the alternatives are calculated locally.
I don’t think this is a major concession to localists. Second, according to
Zimmermann (2000), even a disjunctive permission of the form You may do
φ or you may do ψ gives rise to the free choice inference, and according
to Merin (1992) a conjunctive permission of the form You may do φ and ψ
allows the addressee to perform only φ. I have no idea how to pragmati-
cally account for those intuitions without reinterpreting the semantics of
conjunction as well as disjunction. If our analysis is acceptable, it points to
the direction in which richer pragmatic theories have to go: (i) we have to
take both the speaker’s and the hearer’s perspective into account, and (ii)
one-step inferences (or strong Bi-OT) are not enough, more reasoning has to
be taken into account (i.e., weak Bi-OT, or iteration). These are what I take to
be the main messages of this paper.
References
Aloni, Maria. 2007. Expressing ignorance or indifference: Modal implicatures

in Bi-directional OT. In Balder ten Cate & Henk Zeevat (eds.), Logic,
Language, and Computation: 6th International Tbilisi Symposium on Logic,
Language, and Computation (TbiLLC 2005) (Lecture Notes in Computer
Science 4363), 1–20. Berlin & Heidelberg: Springer. doi:10.1007/978-3-540-
75144-1.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong
permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.
Barker, Richard. 2010. Free choice permission as resource-sensitive reasoning.
Benz, Anton, Gerhard Jäger & Robert van Rooij. 2005. Games and pragmatics
(Palgrave Studies in Pragmatics, Language and Cognition). Houndmills,
Basingstoke & Hampshire: Palgrave Macmillan.
Chemla, Emmanuel. 2008. Similarity: Towards a unified account of scalar
implicatures, free choice permission and presupposition projection. Ms,
Ecole Normale Supérieure & MIT. http://www.semanticsarchive.net/
Archive/WI1ZTU3N/Chemla-SIandPres.html.
Chemla, Emmanuel. 2009. Universal implicatures and free choice effects: Ex-
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and
the syntax/pragmatics interface. In Adriana Belletti (ed.), Structures
and beyond (Oxford Studies in Comparative Syntax: The Cartography
of Syntactic Structures 3), 39–103. Oxford: Oxford University Press.
Chierchia, Gennaro. 2006. Broadening your views: Implicatures of domain
widening and the “logicality” of language. Linguistic Inquiry 37(4). 535–590.
doi:10.1162/ling.2006.37.4.535.
Fine, Kit. 1975. Critical notice of Lewis 1973. Mind, New Series 84(335).
451–458. doi:10.1093/mind/LXXXIV.1.451.
Fox, Danny. 2007. Free choice and the theory of scalar implicatures. In Uli
Sauerland & Penka Stateva (eds.), Presupposition and implicature in compositional
semantics (Palgrave Studies in Pragmatics, Language and Cognition),
71–120. Houndmills, Basingstoke & Hampshire: Palgrave MacMillan.
Franke, Michael. 2009. Signal to act: Game theory in pragmatics. Ams-
terdam: University of Amsterdam dissertation. http://www.illc.uva.nl/
Publications/Dissertations/DS-2009-11.text.pdf.
Franke, Michael. 2010. Free choice from iterated best response. In Maria
Aloni, Katrin Schulz, Harald Bastiaanse & Tikitu de Jager (eds.), Logic,
Language and Meaning: 17th Amsterdam Colloquium (Lecture Notes in
Computer Science 6042), 267–276. Berlin and Heidelberg: Springer. To
appear.
Geurts, Bart. 2010. Quantity implicatures. Cambridge: Cambridge University
Press. To appear.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: A re-
doi:10.3765/sp.2.5.
Groenendijk, Jeroen & Martin Stokhof. 1984. Studies in the semantics of
questions and the pragmatics of answers. Amsterdam: University of
Amsterdam dissertation. http://dare.uva.nl/record/123669.
Harel, David. 1984. Dynamic logic. In Dov Gabbay & Franz Guenthner (eds.),
Handbook of philosophical logic, vol. 2, 497–604. Dordrecht: D. Reidel.
Jäger, Gerhard & Christian Ebert. 2009. Pragmatic rationalizability. In
Arndt Riester & Torgrim Solstad (eds.), Sinn und Bedeutung (SuB13)
(SinSpecC 5), 1–15. Stuttgart. http://www.ims.uni-stuttgart.de/projekte/
sfb-732/sinspec/sub13/jaegerEbert.pdf.
Society, New Series 74. 57–74.
Kamp, Hans. 1979. Semantics versus pragmatics. In Franz Guenther &
Siegfried Schmidt (eds.), Formal semantics and pragmatics for natural
languages (Studies in Linguistics and Philosophy 4), 255–287. Berlin &
Heidelberg: Springer.
Kanger, Stig. 1981. New foundations for ethical theory. In Risto Hilpinen (ed.),
New studies in deontic logic: Norms, actions and the foundations of ethics
(Synthese Library 152), 36–58. Berlin & Heidelberg: Springer.
Krasikova, Sveta. 2007. Quantification in than-clauses. In Maria Aloni, Paul
Dekker & Floris Roelofsen (eds.), Sixteenth Amsterdam Colloquium, 133–
138. Amsterdam: ILLC. doi:10.1.1.156.7902.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns: The

view from Japanese. In Yukio Otsu (ed.), The Third Tokyo Conference on
Psycholinguistics (TCP3), 1–25. Tokyo: Hituzi Syobo.
Landman, Fred. 2000. Events and plurality: The Jerusalem lectures. Dordrecht:
Kluwer.
Larson, Richard. 1988. Scope and comparatives. Linguistics and Philosophy
11(1). 1–26. doi:10.1007/BF00635755.
Leibniz, Gottfried. 1930. Elementa iuris naturalis. In Preussische Akademie
der Wissenschaften (ed.), Gottfried Wilhelm Leibniz: Sämtliche Schriften
und Briefe. Sechste Reihe: Philosophische Schriften, vol. 1, 431–485. Darm-
stadt: Otto Reichl Verlag.
Levinson, Stephen. 2000. Presumptive meanings: The theory of generalized
conversational implicature. Cambridge: MIT Press.
Lewis, David. 1973. Counterfactuals. Oxford: Blackwell.
Lewis, David. 1979. A problem about permission. In Esa Saarinen, Risto
Hilpinen, Ilkka Niiniluoto & Merril Provence Hintikka (eds.), Essays in
honor of Jaakko Hintikka: On the occasion of his fiftieth birthday on
January 12, 1979, 163–175. Dordrecht: D. Reidel.
McCawley, James. 1993. Everything that linguists always wanted to know
about logic∗ . Chicago: The University of Chicago Press 2nd edn.
Merin, Arthur. 1992. Permission sentences stand in the way of Boolean
and other lattice-theoretic semantics. Journal of Semantics 9(2). 95–152.
doi:10.1093/jos/9.2.95.
Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics
14(4). 351–383. doi:10.1007/211050-070-9022-y.
van Rooij, Robert. 2000. Permission to change. Journal of Semantics 17(2).
119–143. doi:10.1093/jos/17.2.119.
van Rooij, Robert. 2006. Free choice counterfactual donkeys. Journal of
Semantics 23(4). 383–402. doi:10.1093/jos/ffl004.
van Rooij, Robert & Katrin Schulz. 2004. Exhaustive interpretation of complex
sentences. Journal of Logic, Language, and Information 13(4). 491–519.
doi:10.1007/s10849-004-2118-6.
Sæbø, Kjell Johan. 2004. Optimal interpretations of permission sentences.
In Rusudan Asatiani, Kata Balogh, Dick de Jongh, George Chikoidze &
Paul Dekker (eds.), The Fifth Tbilisi Symposium on Language, Logic and
Computation (TbiLLC 2003), 137–144. Amsterdam and Tbilisi: ILLC/CLLS.
Sauerland, Uli. 2004. Scalar implicatures of complex sentences. Linguistics
Schulz, Katrin. 2003. You may read it now or later: A case study on the
paradox of free choice permission. Amsterdam: University of Amster-
dam MA thesis. http://www.illc.uva.nl/Publications/ResearchReports/
MoL-2004-01.text.pdf.
permission. Synthese: Knowledge, Rationality and Action 147(2). 343–377.
doi:10.1007/s11229-005-1353-y.
Schulz, Katrin & Robert van Rooij. 2006. Pragmatic meaning and non-
monotonic reasoning: The case of exhaustive interpretation. Linguistics
and Philosophy 29(2). 205–250. doi:10.1007/s10988-005-3760-4.
Spector, Benjamin. 2003. Scalar implicatures: Exhaustivity and Gricean
reasoning. In Balder ten Cate (ed.), Eighth ESSLLI Student Session (European
Summer School in Logic, Language and Information), 277–288. Vienna.
http://www.cs.ucsc.edu/~btencate/esslli03/stus2003proc.pdf.
Spector, Benjamin. 2006. Aspects de la pragmatique des opérateurs logiques.
Paris: University of Paris VII dissertation. http://cognition.ens.fr/
~bspector/THESE_SPECTOR/THESE_SPECTOR_AVEC_ANNEXE2.pdf.
von Wright, G. H. 1950. Deontic logic. Mind 60(237). 1–15.
doi:10.1093/mind/LX.237.1.
doi:10.1023/A:1011255819284.
Robert van Rooij

Nieuwe Doelenstraat 15
1015 CP Amsterdam
Amsterdam
the Netherlands
R.a.m.vanRooij@uva.nl